METHOD FOR THE PRODUCTION OF GLYCEROL 
BY RECOMBINANT ORGANISMS 
FIELD OF INVENTION 
The present invention relates to the field of molecular biology and the use
of recombinant organisms for the production of desired compounds. More
specifically, it describes the expression of cloned genes for glycerol-3-phosphate
dehydrogenase (G3PDH) and glycerol-3-phosphatase (G3P phosphatase), either
separately or together, for the enhanced production of glycerol.
BACKGROUND

Glycerol is a compound in great demand by industry for use in cosmetics, 
liquid soaps, food, pharmaceuticals, lubricants, anti-freeze solutions, and in 
numerous other applications. The esters of glycerol are important in the fat and 
oil industry. 

Not all organisms have a natural capacity to synthesize glycerol.
However, the biological production of glycerol is known for some species of
bacteria, algae, and yeasts. The bacteria Bacillus licheniformis and Lactobacillus
lycopersica synthesize glycerol. Glycerol production is found in the halotolerant
algae Dunaliella sp. and Asteromonas gracilis for protection against high external
salt concentrations (Ben-Amotz et al., (1982) Experientia 38:49-52). Similarly,
various osmotolerant yeasts synthesize glycerol as a protective measure. Most
strains of Saccharomyces produce some glycerol during alcoholic fermentation,
and this can be increased physiologically by the application of osmotic stress
(Albertyn et al., (1994) Mol. Cell. Biol. 14:4135-4144). Earlier this century
glycerol was produced commercially with Saccharomyces cultures to which
steering reagents such as sulfites or alkalis were added. Through the formation of
an inactive complex, the steering agents block or inhibit the conversion of
acetaldehyde to ethanol; thus, excess reducing equivalents (NADH) are available
to or "steered" towards dihydroxyacetone phosphate (DHAP) for reduction to
produce glycerol. This method is limited by the partial inhibition of yeast growth
that is due to the sulfites. This limitation can be partially overcome by the use of
alkalis, which create excess NADH equivalents by a different mechanism. In this
practice, the alkalis initiate a Cannizzaro disproportionation to yield ethanol and
acetic acid from two equivalents of acetaldehyde.
The gene encoding glycerol-3-phosphate dehydrogenase (DAR1, GPD1)
has been cloned and sequenced from Saccharomyces diastaticus (Wang et al.,
(1994), J. Bact. 176:7091-7095). The DAR1 gene was cloned into a shuttle vector
and used to transform E. coli, where expression produced active enzyme. Wang et
al., supra, recognize that DAR1 is regulated by the cellular osmotic environment
but do not suggest how the gene might be used to enhance glycerol production
in a recombinant organism.

Other glycerol-3-phosphate dehydrogenase enzymes have been isolated.
For example, sn-glycerol-3-phosphate dehydrogenase has been cloned and
sequenced from S. cerevisiae (Larsson et al., (1993) Mol. Microbiol., 10:1101).
Albertyn et al., ((1994) Mol. Cell. Biol., 14:4135) teach the cloning of
GPD1 encoding a glycerol-3-phosphate dehydrogenase from S. cerevisiae. Like
Wang et al., both Albertyn et al. and Larsson et al. recognize the osmo-sensitivity
of the regulation of this gene but do not suggest how the gene might be used in the
production of glycerol in a recombinant organism.

As with G3PDH, glycerol-3-phosphatase has been isolated from
Saccharomyces cerevisiae and the protein identified as being encoded by the
GPP1 and GPP2 genes (Norbeck et al., (1996) J. Biol. Chem., 271:13875). Like
the genes encoding G3PDH, it appears that GPP2 is osmotically-induced.

There is no known art that teaches glycerol production from recombinant
organisms with G3PDH/G3P phosphatase expressed together or separately. Nor
is there known art that teaches glycerol production from any wild-type organism
with these two enzyme activities that does not require applying some stress (salt
or an osmolyte) to the cell. Eustace ((1987), Can. J. Microbiol., 33:112-117)
teaches away from achieving glycerol production by recombinant DNA
techniques. By selective breeding techniques, these investigators created a
hybridized yeast strain that produced glycerol at greater levels than the parent
strains; however, the G3PDH activity remained constant or slightly lower.
A microorganism capable of producing glycerol under physiological
conditions is industrially desirable, especially when the glycerol itself will be used
as a substrate in vivo as part of a more complex catabolic or biosynthetic pathway
that could be perturbed by osmotic stress or the addition of steering agents.

The problem to be solved, therefore, is how to direct carbon flux towards
glycerol production by the addition or enhancement of certain enzyme activities,
especially G3PDH and G3P phosphatase, which respectively catalyze the
conversion of dihydroxyacetone phosphate (DHAP) to glycerol-3-phosphate
(G3P) and then to glycerol. This process has not previously been described for a
recombinant organism and required the isolation of genes encoding the two
enzymes and their subsequent expression. A surprising and unanticipated
difficulty encountered was the toxicity of G3P phosphatase to the host, which
required careful control of its expression levels to avoid growth inhibition.
SUMMARY OF THE INVENTION
The present invention provides a method for the production of glycerol
from a recombinant organism comprising: (i) transforming a suitable host cell
with an expression cassette comprising either or both
(a) a gene encoding a glycerol-3-phosphate dehydrogenase enzyme;
(b) a gene encoding a glycerol-3-phosphate phosphatase enzyme;
(ii) culturing the transformed host cell in the presence of at least one carbon
source selected from the group consisting of monosaccharides, oligosaccharides,
polysaccharides, and single-carbon substrates, or mixtures thereof, whereby
glycerol is produced; and (iii) recovering the glycerol. Glucose is the most
preferred carbon source.

The invention further provides transformed host cells comprising
expression cassettes capable of expressing glycerol-3-phosphate dehydrogenase
and glycerol-3-phosphatase activities for the production of glycerol.

BRIEF DESCRIPTION OF BIOLOGICAL 
DEPOSITS AND SEQUENCE LISTING 
Applicants have made the following biological deposits under the terms of
the Budapest Treaty on the International Recognition of the Deposit of
Micro-organisms for the Purposes of Patent Procedure:

Depositor Identification Reference    Int'l. Depository Designation    Date of Deposit

Escherichia coli pAH21/DH5α           ATCC 98187                       26 September 1996
(containing the GPP2 gene)

Escherichia coli (pDAR1A/AA200)       ATCC 98248                       6 November 1996
(containing the DAR1 gene)

"ATCC" refers to the American Type Culture Collection international
depository located at 12301 Parklawn Drive, Rockville, MD 20852 U.S.A. The
designation is the accession number of the deposited material.

Applicants have provided 23 sequences in conformity with the Rules for
the Standard Representation of Nucleotide and Amino Acid Sequences in Patent
Applications (Annexes I and II to the Decision of the President of the EPO,
published in Supplement No. 2 to OJ EPO, 12/1992) and with 37 C.F.R.
1.821-1.825 and Appendices A and B (Requirements for Application Disclosures
Containing Nucleotides and/or Amino Acid Sequences).

DETAILED DESCRIPTION OF THE INVENTION 
The present invention provides a method for the biological production of
glycerol from a fermentable carbon source in a recombinant organism. The
method provides a rapid, inexpensive and environmentally-responsible source of
glycerol useful in the cosmetics and pharmaceutical industries. The method uses a
microorganism containing cloned homologous or heterologous genes encoding
glycerol-3-phosphate dehydrogenase (G3PDH) and/or glycerol-3-phosphatase
(G3P phosphatase). The microorganism is contacted with a carbon source and
glycerol is isolated from the conditioned media. The genes may be incorporated
into the host microorganism separately or together for the production of glycerol.

As used herein the following terms may be used for interpretation of the
claims and specification.
The terms "glycerol-3-phosphate dehydrogenase" and "G3PDH" refer to a
polypeptide responsible for an enzyme activity that catalyzes the conversion of
dihydroxyacetone phosphate (DHAP) to glycerol-3-phosphate (G3P). In vivo
G3PDH may be NADH-, NADPH-, or FAD-dependent. The NADH-dependent
enzyme (EC 1.1.1.8) is encoded by several genes including GPD1 (GenBank
Z74071x2), or GPD2 (GenBank Z35169x1), or GPD3 (GenBank G984182), or
DAR1 (GenBank Z74071x2). The NADPH-dependent enzyme (EC 1.1.1.94) is
encoded by gpsA (GenBank U321643, (cds 197911-196892) G466746 and
L45246). The FAD-dependent enzyme (EC 1.1.99.5) is encoded by GUT2
(GenBank Z47047x23), or glpD (GenBank G147838), or glpABC (GenBank
M20938).

The terms "glycerol-3-phosphatase", "sn-glycerol-3-phosphatase", or
"d,l-glycerol phosphatase", and "G3P phosphatase" refer to a polypeptide
responsible for an enzyme activity that catalyzes the conversion of
glycerol-3-phosphate to glycerol. G3P phosphatase is encoded by GPP1 (GenBank
Z47047x125), or GPP2 (GenBank U18813x11).

The term "glycerol kinase" refers to a polypeptide responsible for an
enzyme activity that catalyzes the conversion of glycerol to glycerol-3-phosphate,
or glycerol-3-phosphate to glycerol, depending on reaction conditions. Glycerol
kinase is encoded by GUT1 (GenBank U11583x19).

The terms "GPD1", "DAR1", "OSG1", "D2830", and "YDL022W" will
be used interchangeably and refer to a gene that encodes a cytosolic
glycerol-3-phosphate dehydrogenase and is characterized by the base sequence
given as SEQ ID NO:1.

The term "GPD2" refers to a gene that encodes a cytosolic
glycerol-3-phosphate dehydrogenase and is characterized by the base sequence
given in SEQ ID NO:2.

The terms "GUT2" and "YIL155C" are used interchangeably and refer to a
gene that encodes a mitochondrial glycerol-3-phosphate dehydrogenase and is
characterized by the base sequence given in SEQ ID NO:3.

The terms "GPP1", "RHR2" and "YIL053W" are used interchangeably
and refer to a gene that encodes a cytosolic glycerol-3-phosphatase and is
characterized by the base sequence given in SEQ ID NO:4.

The terms "GPP2", "HOR2" and "YER062C" are used interchangeably
and refer to a gene that encodes a cytosolic glycerol-3-phosphatase and is
characterized by the base sequence given as SEQ ID NO:5.

The term "GUT1" refers to a gene that encodes a cytosolic glycerol kinase
and is characterized by the base sequence given as SEQ ID NO:6.

As used herein, the terms "function" and "enzyme function" refer to the
catalytic activity of an enzyme in altering the energy required to perform a
specific chemical reaction. Such an activity may apply to a reaction in
equilibrium where the production of both product and substrate may be
accomplished under suitable conditions.

The terms "polypeptide" and "protein" are used herein interchangeably.

The terms "carbon substrate" and "carbon source" refer to a carbon source
capable of being metabolized by host organisms of the present invention and
particularly mean carbon sources selected from the group consisting of
monosaccharides, oligosaccharides, polysaccharides, and one-carbon substrates or
mixtures thereof.

The terms "host cell" and "host organism" refer to a microorganism
capable of receiving foreign or heterologous genes and expressing those genes to
produce an active gene product.

The terms "foreign gene", "foreign DNA", "heterologous gene", and
"heterologous DNA" all refer to genetic material native to one organism that has
been placed within a different host organism.

The terms "recombinant organism" and "transformed host" refer to any
organism transformed with heterologous or foreign genes. The recombinant
organisms of the present invention express foreign genes encoding G3PDH and
G3P phosphatase for the production of glycerol from suitable carbon substrates.

"Gene" refers to a nucleic acid fragment that expresses a specific protein,
including regulatory sequences preceding (5' non-coding) and following
(3' non-coding) the coding region. The terms "native" and "wild-type" gene refer
to the gene as found in nature with its own regulatory sequences.

As used herein, the terms "encoding" and "coding" refer to the process by
which a gene, through the mechanisms of transcription and translation, produces
an amino acid sequence. The process of encoding a specific amino acid sequence
is meant to include DNA sequences that may involve base changes that do not
cause a change in the encoded amino acid, or which involve base changes which
may alter one or more amino acids, but do not affect the functional properties of
the protein encoded by the DNA sequence. Therefore, the invention encompasses
more than the specific exemplary sequences. Modifications to the sequence, such
as deletions, insertions, or substitutions in the sequence which produce silent
changes that do not substantially affect the functional properties of the resulting
protein molecule are also contemplated. For example, alterations in the gene
sequence which reflect the degeneracy of the genetic code, or which result in the
production of a chemically equivalent amino acid at a given site, are
contemplated; thus, a codon for the amino acid alanine, a hydrophobic amino acid,
may be substituted by a codon encoding another less hydrophobic residue, such as
glycine, or a more hydrophobic residue, such as valine, leucine, or isoleucine.
Similarly, changes which result in substitution of one negatively charged residue
for another, such as aspartic acid for glutamic acid, or one positively charged
residue for another, such as lysine for arginine, can also be expected to produce a
biologically equivalent product. Nucleotide changes which result in alteration of
the N-terminal and C-terminal portions of the protein molecule would also not be
expected to alter the activity of the protein. In some cases, it may in fact be
desirable to make mutants of the sequence in order to study the effect of alteration
on the biological activity of the protein. Each of the proposed modifications is
well within the routine skill in the art, as is determination of retention of
biological activity in the encoded products. Moreover, the skilled artisan
recognizes that sequences encompassed by this invention are also defined by their
ability to hybridize, under stringent conditions (0.1X SSC, 0.1% SDS, 65 °C),
with the sequences exemplified herein.
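
The distinction drawn above between silent base changes and conservative
amino acid substitutions can be illustrated with a short sketch. The codon
assignments below are a small excerpt of the standard genetic code; the
function names (translate, classify_change) are hypothetical and are not part
of the specification.

```python
# Excerpt of the standard genetic code covering the residues named in the
# paragraph above. Only these codons are handled; this is an illustration,
# not a full translation table.
CODON_TABLE = {
    "GCT": "Ala", "GCC": "Ala", "GCA": "Ala", "GCG": "Ala",
    "GGT": "Gly", "GGC": "Gly", "GGA": "Gly", "GGG": "Gly",
    "GTT": "Val", "GTC": "Val", "GTA": "Val", "GTG": "Val",
    "GAT": "Asp", "GAC": "Asp",
    "GAA": "Glu", "GAG": "Glu",
    "AAA": "Lys", "AAG": "Lys",
}


def translate(dna):
    """Translate an in-frame DNA string into a list of three-letter residues."""
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    codons = [dna[i:i + 3] for i in range(0, len(dna), 3)]
    return [CODON_TABLE[codon] for codon in codons]


def classify_change(codon_before, codon_after):
    """Return 'silent' if both codons encode the same residue, else 'substitution'."""
    same = CODON_TABLE[codon_before] == CODON_TABLE[codon_after]
    return "silent" if same else "substitution"
```

Under this excerpt, classify_change("GCT", "GCC") reports a silent change
(both codons encode alanine), whereas classify_change("GCT", "GGT") reports a
substitution (alanine to glycine). Whether such a substitution preserves
enzyme function must still be determined by assay.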

The term "expression" refers to the transcription and translation to gene
product from a gene coding for the sequence of the gene product.

The terms "plasmid", "vector", and "cassette" as used herein refer to an
extrachromosomal element often carrying genes which are not part of the central
metabolism of the cell and usually in the form of circular double-stranded DNA
molecules. Such elements may be autonomously replicating sequences, genome
integrating sequences, phage or nucleotide sequences, linear or circular, of a
single- or double-stranded DNA or RNA, derived from any source, in which a
number of nucleotide sequences have been joined or recombined into a unique
construction which is capable of introducing a promoter fragment and DNA
sequence for a selected gene product along with appropriate 3' untranslated
sequence into a cell. "Transformation cassette" refers to a specific vector
containing a foreign gene and having elements in addition to the foreign gene that
facilitate transformation of a particular host cell. "Expression cassette" refers to a
specific vector containing a foreign gene and having elements in addition to the
foreign gene that allow for enhanced expression of that gene in a foreign host.

The terms "transformation" and "transfection" refer to the acquisition of
new genes in a cell after the incorporation of nucleic acid. The acquired genes
may be integrated into chromosomal DNA or introduced as extrachromosomal
replicating sequences. The term "transformant" refers to the cell resulting from a
transformation.

The term "genetically altered" refers to the process of changing hereditary
material by transformation or mutation.

Representative enzyme pathways
It is contemplated that glycerol may be produced in recombinant
organisms by the manipulation of the glycerol biosynthetic pathway found in most
microorganisms. Typically, a carbon substrate such as glucose is converted to
glucose-6-phosphate via hexokinase in the presence of ATP. Glucose-phosphate
isomerase catalyzes the conversion of glucose-6-phosphate to
fructose-6-phosphate, which is then converted to fructose-1,6-diphosphate
through the action of 6-phosphofructokinase. The diphosphate is then taken to
dihydroxyacetone phosphate (DHAP) via aldolase. Finally, NADH-dependent G3PDH
converts DHAP to glycerol-3-phosphate, which is then dephosphorylated to
glycerol by G3P phosphatase (Agarwal (1990), Adv. Biochem. Engrg. 41:114).
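
The sequence of conversions described above can be written out as an ordered
list of steps. The following is a minimal sketch for illustration only; the
Step type and the trace function are hypothetical names, and the metabolite
and enzyme labels simply restate the preceding paragraph.

```python
from typing import List, NamedTuple


class Step(NamedTuple):
    substrate: str
    enzyme: str
    product: str


# Steps from glucose to glycerol, in the order given in the text.
GLYCEROL_PATHWAY = [
    Step("glucose", "hexokinase", "glucose-6-phosphate"),
    Step("glucose-6-phosphate", "glucose-phosphate isomerase", "fructose-6-phosphate"),
    Step("fructose-6-phosphate", "6-phosphofructokinase", "fructose-1,6-diphosphate"),
    Step("fructose-1,6-diphosphate", "aldolase", "dihydroxyacetone phosphate"),
    Step("dihydroxyacetone phosphate", "G3PDH", "glycerol-3-phosphate"),
    Step("glycerol-3-phosphate", "G3P phosphatase", "glycerol"),
]


def trace(pathway: List[Step], start: str) -> List[str]:
    """Return the ordered metabolites visited from `start`, checking continuity."""
    route = [start]
    current = start
    for step in pathway:
        if step.substrate != current:
            raise ValueError(f"pathway is not contiguous at {step.substrate!r}")
        current = step.product
        route.append(current)
    return route
```

Calling trace(GLYCEROL_PATHWAY, "glucose") yields the seven metabolites from
glucose to glycerol. The last two steps are the ones catalyzed by the enzymes
that the present method expresses.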
Alternate pathways for glycerol production
An alternative pathway for glycerol production from DHAP has been
suggested (Wang et al., (1994) J. Bact. 176:7091-7095). In this proposed pathway
DHAP could be dephosphorylated by a specific or non-specific phosphatase to
give dihydroxyacetone, which could then be reduced to glycerol by a
dihydroxyacetone reductase. Dihydroxyacetone reductase is known in prokaryotes
and in Schizosaccharomyces pombe, and cloning and expression of such activities
together with an appropriate phosphatase could lead to glycerol production.
Another alternative pathway for glycerol production from DHAP has been
suggested (Redkar (1995), Experimental Mycology, 19:241). In this
pathway DHAP is isomerized to glyceraldehyde-3-phosphate by the common
glycolytic enzyme triose phosphate isomerase. Glyceraldehyde-3-phosphate is
dephosphorylated to glyceraldehyde, which is then reduced by alcohol
dehydrogenase or a NADP-dependent glycerol dehydrogenase activity. The
cloning and expression of the phosphatase and dehydrogenase activities from
Aspergillus nidulans could lead to glycerol production.
Genes encoding G3PDH and G3P phosphatase

The present invention provides genes suitable for the expression of
G3PDH and G3P phosphatase activities in a host cell.

Genes encoding G3PDH are known. For example, GPD1 has been
isolated from Saccharomyces and has the base sequence given by SEQ ID NO:1,
encoding the amino acid sequence given in SEQ ID NO:7 (Wang et al., supra).
Similarly, G3PDH activity has also been isolated from Saccharomyces encoded
by GPD2 having the base sequence given in SEQ ID NO:2 encoding the amino
acid sequence given in SEQ ID NO:8 (Eriksson et al., (1995) Mol. Microbiol.,
17:95).

For the purposes of the present invention it is contemplated that any gene
encoding a polypeptide responsible for G3PDH activity is suitable wherein that
activity is capable of catalyzing the conversion of dihydroxyacetone phosphate
(DHAP) to glycerol-3-phosphate (G3P). Further, it is contemplated that any gene
encoding the amino acid sequence of G3PDH as given by SEQ ID NOS:7, 8, 9,
10, 11 and 12 corresponding to the genes GPD1, GPD2, GUT2, gpsA, glpD, and
the α subunit of glpABC respectively, will be functional in the present invention
wherein that amino acid sequence may encompass amino acid substitutions,
deletions or additions that do not alter the function of the enzyme. The skilled
person will appreciate that genes encoding G3PDH isolated from other sources
will also be suitable for use in the present invention. For example, genes isolated
from prokaryotes include GenBank accessions M34393, M20938, L06231,
U12567, L45246, L45323, L45324, L45325, U32164, U32689, and U39682.
Genes isolated from fungi include GenBank accessions U30625, U30876 and
X56162; genes isolated from insects include GenBank accessions X61223 and
X14179; and genes isolated from mammalian sources include GenBank
accessions U12424, M25558 and X78593.

Genes encoding G3P phosphatase are known. For example, GPP2 has
been isolated from Saccharomyces cerevisiae and has the base sequence given by
SEQ ID NO:5, which encodes the amino acid sequence given in SEQ ID NO:13
(Norbeck et al., (1996), J. Biol. Chem., 271:13875).

For the purposes of the present invention, any gene encoding a G3P
phosphatase activity is suitable for use in the method wherein that activity is
capable of catalyzing the conversion of glycerol-3-phosphate to glycerol. Further,
any gene encoding the amino acid sequence of G3P phosphatase as given by
SEQ ID NOS:13 and 14 corresponding to the genes GPP2 and GPP1 respectively,
will be functional in the present invention including any amino acid sequence that
encompasses amino acid substitutions, deletions or additions that do not alter the
function of the G3P phosphatase enzyme. The skilled person will appreciate that
genes encoding G3P phosphatase isolated from other sources will also be suitable
for use in the present invention. For example, the dephosphorylation of
glycerol-3-phosphate to yield glycerol may be achieved with one or more of the
following general or specific phosphatases: alkaline phosphatase (EC 3.1.3.1)
[GenBank M19159, M29663, U02550 or M33965]; acid phosphatase (EC 3.1.3.2)
[GenBank U51210, U19789, U28658 or L20566]; glycerol-3-phosphatase
(EC 3.1.3.-) [GenBank Z38060 or U18813x11]; glucose-1-phosphatase
(EC 3.1.3.10) [GenBank M33807]; glucose-6-phosphatase (EC 3.1.3.9)
[GenBank U00445]; fructose-1,6-bisphosphatase (EC 3.1.3.11) [GenBank X12545
or J03207] or phosphotidyl glycero phosphate phosphatase (EC 3.1.3.27)
[GenBank M23546 and M23628].

Genes encoding glycerol kinase are known. For example, GUT1 encoding
the glycerol kinase from Saccharomyces has been isolated and sequenced (Pavlik
et al. (1993), Curr. Genet., 24:21) and the base sequence is given by
SEQ ID NO:6, which encodes the amino acid sequence given in SEQ ID NO:15.
The skilled artisan will appreciate that, although glycerol kinase catalyzes the
degradation of glycerol in nature, the same enzyme will be able to function in the
synthesis of glycerol, converting glycerol-3-phosphate to glycerol under the
appropriate reaction energy conditions. Evidence exists for glycerol production
through a glycerol kinase. Under anaerobic or respiration-inhibited conditions,
Trypanosoma brucei gives rise to glycerol in the presence of glycerol-3-P and
ADP. The reaction occurs in the glycosome compartment (Hammond, (1985),
J. Biol. Chem., 260:15646-15654).
Host cells

Suitable host cells for the recombinant production of glycerol by the
expression of G3PDH and G3P phosphatase may be either prokaryotic or
eukaryotic and will be limited only by their ability to express active enzymes.
Preferred host cells will be those bacteria, yeasts, and filamentous fungi typically
useful for the production of glycerol such as Citrobacter, Enterobacter,
Clostridium, Klebsiella, Aerobacter, Lactobacillus, Aspergillus, Saccharomyces,
Schizosaccharomyces, Zygosaccharomyces, Pichia, Kluyveromyces, Candida,
Hansenula, Debaryomyces, Mucor, Torulopsis, Methylobacter, Escherichia,
Salmonella, Bacillus, Streptomyces and Pseudomonas. Preferred in the present
invention are E. coli and Saccharomyces.

Vectors and expression cassettes

The present invention provides a variety of vectors and transformation and
expression cassettes suitable for the cloning, transformation and expression of
G3PDH and G3P phosphatase into a suitable host cell. Suitable vectors will be
those which are compatible with the bacterium employed. Suitable vectors can be
derived, for example, from a bacterium, a virus (such as bacteriophage T7 or an
M-13 derived phage), a cosmid, a yeast or a plant. Protocols for obtaining and
using such vectors are known to those in the art (Sambrook et al., Molecular
Cloning: A Laboratory Manual - volumes 1, 2, 3 (Cold Spring Harbor Laboratory:
Cold Spring Harbor, NY, 1989)).
Typically, the vector or cassette contains sequences directing transcription
and translation of the appropriate gene, a selectable marker, and sequences
allowing autonomous replication or chromosomal integration. Suitable vectors
comprise a region 5' of the gene which harbors transcriptional initiation controls
and a region 3' of the DNA fragment which controls transcriptional termination. It
is most preferred when both control regions are derived from genes homologous
to the transformed host cell. Such control regions, however, need not be derived
from the genes native to the specific species chosen as a production host.

Initiation control regions, or promoters, which are useful to drive
expression of the G3PDH and G3P phosphatase genes in the desired host cell are
numerous and familiar to those skilled in the art. Virtually any promoter capable
of driving these genes is suitable for the present invention including but not
limited to CYC1, HIS3, GAL1, GAL10, ADH1, PGK, PHO5, GAPDH, ADC1,
TRP1, URA3, LEU2, ENO, and TPI (useful for expression in Saccharomyces);
AOX1 (useful for expression in Pichia); and lac, trp, λPL, λPR, T7, tac, and trc
(useful for expression in E. coli).

Termination control regions may also be derived from various genes native 
to the preferred hosts. Optionally, a termination site may be unnecessary; 
however, it is most preferred if included. 

For effective expression of the instant enzymes, DNA encoding the
enzymes is linked operably through initiation codons to selected expression
control regions such that expression results in the formation of the appropriate
messenger RNA.

Transformation of suitable hosts and expression of G3PDH and G3P phosphatase
for the production of glycerol

Once suitable cassettes are constructed they are used to transform
appropriate host cells. Introduction of the cassette containing the genes encoding
G3PDH and/or G3P phosphatase into the host cell may be accomplished by
known procedures such as by transformation, e.g., using calcium-permeabilized
cells, electroporation, or by transfection using a recombinant phage virus
(Sambrook et al., supra).

In the present invention AH21 and DAR1 cassettes were used to transform
E. coli DH5α as fully described in the GENERAL METHODS and
EXAMPLES.
Media and Carbon Substrates

Fermentation media in the present invention must contain suitable carbon
substrates. Suitable substrates may include but are not limited to monosaccharides
such as glucose and fructose, oligosaccharides such as lactose or sucrose,
polysaccharides such as starch or cellulose or mixtures thereof, and unpurified
mixtures from renewable feedstocks such as cheese whey permeate, cornsteep
liquor, sugar beet molasses, and barley malt. Additionally, the carbon substrate
may also be one-carbon substrates such as carbon dioxide or methanol, for which
metabolic conversion into key biochemical intermediates has been demonstrated.
Glycerol production from single carbon sources (e.g., methanol,
formaldehyde or formate) has been reported in methylotrophic yeasts (Yamada
et al. (1989), Agric. Biol. Chem., 53(2):541-543) and in bacteria (Hunter et al.
(1985), Biochemistry, 24:4148-4155). These organisms can assimilate single
carbon compounds, ranging in oxidation state from methane to formate, and
produce glycerol. The pathway of carbon assimilation can be through ribulose
monophosphate, through serine, or through xylulose-monophosphate (Gottschalk,
Bacterial Metabolism, Second Edition, Springer-Verlag: New York (1986)). The
ribulose monophosphate pathway involves the condensation of formate with
ribulose-5-phosphate to form a 6 carbon sugar that becomes fructose and
eventually the three carbon product, glyceraldehyde-3-phosphate. Likewise, the
serine pathway assimilates the one-carbon compound into the glycolytic pathway
via methylenetetrahydrofolate.

In addition to one and two carbon substrates, methylotrophic organisms
are also known to utilize a number of other carbon-containing compounds such as
methylamine, glucosamine and a variety of amino acids for metabolic activity.
For example, methylotrophic yeast are known to utilize the carbon from
methylamine to form trehalose or glycerol (Bellion et al. (1993), Microb. Growth
C1 Compd., [Int. Symp.], 7th, 415-32. Editor(s): Murrell, J. Collin; Kelly,
Don P. Publisher: Intercept, Andover, UK). Similarly, various species of
Candida will metabolize alanine or oleic acid (Sulter et al. (1990), Arch.
Microbiol., 153(5):485-9). Hence, the source of carbon utilized in the present
invention may encompass a wide variety of carbon-containing substrates and will
only be limited by the choice of organism.

Although all of the above mentioned carbon substrates and mixtures
thereof are suitable in the present invention, preferred carbon substrates are
monosaccharides, oligosaccharides, polysaccharides, single-carbon substrates or
mixtures thereof. More preferred are sugars such as glucose, fructose, sucrose,
maltose, lactose and single carbon substrates such as methanol and carbon
dioxide. Most preferred as a carbon substrate is glucose.

In addition to an appropriate carbon source, fermentation media must
contain suitable minerals, salts, cofactors, buffers and other components, known to
those skilled in the art, suitable for the growth of the cultures and promotion of the
enzymatic pathway necessary for glycerol production.
Culture Conditions

Typically cells are grown at 30 °C in appropriate media. Preferred growth
media are common commercially prepared media such as Luria Bertani (LB)
broth, Sabouraud Dextrose (SD) broth, or Yeast medium (YM) broth. Other
defined or synthetic growth media may also be used and the appropriate medium
for growth of the particular microorganism will be known by one skilled in the art
of microbiology or fermentation science. The use of agents known to modulate
catabolite repression directly or indirectly, e.g., cyclic adenosine
2':3'-monophosphate, may also be incorporated into the reaction media.
Similarly, the use of agents known to modulate enzymatic activities (e.g.,
sulfites, bisulfites, and alkalis) that lead to enhancement of glycerol production
may be used in conjunction with or as an alternative to genetic manipulations.

Suitable pH ranges for the fermentation are from pH 5.0 to pH 9.0,
where the range of pH 6.0 to pH 8.0 is preferred for the initial condition.

Reactions may be performed under aerobic or anaerobic conditions, where
anaerobic or microaerobic conditions are preferred.
Identification and purification of G3PDH and G3P phosphatase

The levels of expression of the proteins G3PDH and G3P phosphatase are
measured by enzyme assays. The G3PDH activity assay relies on the spectral
properties of the cosubstrate, NADH, in the DHAP conversion to G3P. NADH
has intrinsic UV/vis absorption and its consumption can be monitored
spectrophotometrically at 340 nm. G3P phosphatase activity can be measured by any
method of measuring the inorganic phosphate liberated in the reaction. The most
commonly used detection method uses the visible spectroscopic determination of
a blue-colored phosphomolybdate ammonium complex.
Identification and recovery of glycerol

Glycerol may be identified and quantified by high performance liquid
chromatography (HPLC) and gas chromatography/mass spectroscopy (GC/MS)
analyses on the cell-free extracts. Preferred is a method where the fermentation
media are analyzed on an analytical ion exchange column using a mobile phase of
0.01 N sulfuric acid in an isocratic fashion.

Methods for the recovery of glycerol from fermentation media are known
in the art. For example, glycerol can be obtained from cell media by subjecting
the reaction mixture to the following sequence of steps: filtration; water removal;
organic solvent extraction; and fractional distillation (U.S. Patent No. 2,986,495).
Selection of transformants by complementation

In the absence of a functional gpsA-encoded G3PDH, E. coli cells are
unable to synthesize G3P, a condition which leads to a block in membrane
biosynthesis. Cells with such a block are auxotrophic, requiring that either
glycerol or G3P be present in the culture media for synthesis of membrane
phospholipids.

A cloned heterologous wild-type gpsA gene is able to complement the
chromosomal gpsA mutation to allow growth in media lacking glycerol or G3P
(Wang, et al. (1994), J. Bact. 176:7091-7095). Based on this complementation
strategy, growth of gpsA-defective cells on glucose would only occur if they
possessed a plasmid-encoded gpsA, allowing a selection based on synthesis of
G3P from DHAP. Cells which lose the recombinant gpsA plasmid during culture
would fail to synthesize G3P and cell growth would subsequently be inhibited.
The complementing G3PDH activity can be expressed not only from gpsA, but
also from other cloned genes expressing G3PDH activity such as GPD1, GPD2,
GPD3, GUT2, glpD, and glpABC. These can be maintained in a gpsA-defective
E. coli strain such as BB20 (Cronan et al. (1974), J. Bact., 118:598), alleviating
the need to use antibiotic selection and its prohibitive cost in large-scale
fermentations.

A related strategy can be used for expression and selection in
osmoregulatory mutants of S. cerevisiae (Larsson et al. (1993), Mol. Microbiol.,
10:1101-1111). These osg1 mutants are unable to grow at low water potential and
show a decreased capacity for glycerol production and reduced G3PDH activity.
The osg1 salt sensitivity defect can be complemented by a cloned and expressed
G3PDH gene. Thus, the ability to synthesize glycerol can be used simultaneously
as a selection marker for the desired glycerol-producing cells.

EXAMPLES 

GENERAL METHODS 

Procedures for phosphorylations, ligations, and transformations are well
known in the art. Techniques suitable for use in the following examples may be
found in Sambrook et al., Molecular Cloning: A Laboratory Manual, Second
Edition, Cold Spring Harbor Laboratory Press (1989).

Materials and methods suitable for the maintenance and growth of
bacterial cultures are well known in the art. Techniques suitable for use in the
following examples may be found in Manual of Methods for General Bacteriology
(Phillipp Gerhardt, R. G. E. Murray, Ralph N. Costilow, Eugene W. Nester, Willis
A. Wood, Noel R. Krieg and G. Briggs Phillips, eds), American Society for
Microbiology, Washington, DC. (1994) or in Biotechnology: A Textbook of
Industrial Microbiology (Thomas D. Brock, Second Edition (1989) Sinauer
Associates, Inc., Sunderland, MA). All reagents and materials used for the growth
and maintenance of bacterial cells were obtained from Aldrich Chemicals
(Milwaukee, WI), DIFCO Laboratories (Detroit, MI), GIBCO/BRL (Gaithersburg,
MD), or Sigma Chemical Company (St. Louis, MO) unless otherwise specified.

The meaning of abbreviations is as follows: "h" means hour(s), "min"
means minute(s), "sec" means second(s), "d" means day(s), "mL" means
milliliters, "L" means liters.

Cell strains

The following Escherichia coli strains were used for transformation and
expression of G3PDH and G3P phosphatase. Strains were obtained from the
E. coli Genetic Stock Center or from Life Technologies (Gaithersburg, MD).

AA200 (garB10 fhuA22 ompF627 fadL701 relA1 pit-10 spoT1 tpi-1 phoM510
mcrB1) (Anderson et al., (1970), J. Gen. Microbiol., 62:329).

BB20 (tonA22 ΔphoA8 fadL701 relA1 glpR2 glpD3 pit-10 gpsA20 spoT1 T2R)
(Cronan et al., J. Bact., 118:598).

DH5α (deoR endA1 gyrA96 hsdR17 recA1 relA1 supE44 thi-1 Δ(lacZYA-
argFV169) phi80lacZΔM15 F-) (Woodcock et al., (1989), Nucl. Acids Res.,
17:3469).

Identification of Glycerol

The conversion of glucose to glycerol was monitored by HPLC and/or GC.
Analyses were performed using standard techniques and materials available to one
of skill in the art of chromatography. One suitable method utilized a Waters
Maxima 820 HPLC system using UV (210 nm) and RI detection. Samples were
injected onto a Shodex SH-1011 column (8 mm x 300 mm; Waters, Milford, MA)
equipped with a Shodex SH-1011P precolumn (6 mm x 50 mm), temperature-
controlled at 50 °C, using 0.01 N H2SO4 as mobile phase at a flow rate of
0.5 mL/min. When quantitative analysis was desired, samples were injected onto
a Shodex SH-1011 column (8 mm x 300 mm; Waters, Milford, MA) equipped
with a Shodex SH-1011P precolumn (6 mm x 50 mm), temperature-controlled at
50 °C, using 0.01 N H2SO4 as mobile phase at a flow rate of 0.69 mL/min. When
quantitative analysis was desired, samples were prepared with a known amount of
trimethylacetic acid as an external standard. Typically, the retention times of
glycerol (RI detection) and glucose (RI detection) were 17.03 min and 12.66 min,
respectively.

Glycerol was also analyzed by GC/MS. Gas chromatography with mass
spectrometry for detection and quantitation of glycerol was done using a
DB-WAX column (30 m, 0.32 mm I.D., 0.25 µm film thickness, J & W Scientific,
Folsom, CA), at the following conditions: injector: split, 1:15; sample volume:
1 µL; temperature profile: 150 °C initial temperature with 30 sec hold, 40 °C/min
to 180 °C, 20 °C/min to 240 °C, hold for 2.5 min. Detection: EI Mass
Spectrometry (Hewlett Packard 5971, San Fernando, CA), quantitative SIM using
ions 61 m/z and 64 m/z as target ions for glycerol and glycerol-d8, and ion 43 m/z
as qualifier ion for glycerol. Glycerol-d8 was used as an internal standard.
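
The GC temperature profile given above (initial hold, two linear ramps, final hold) fixes the length of one oven program. The following is a minimal sketch of that arithmetic; the function name and argument layout are illustrative assumptions, not part of the described method, and the calculation ignores any cool-down or equilibration time between injections.

```python
def oven_program_minutes(initial_temp_c, initial_hold_min, ramps, final_hold_min):
    """Total duration of a GC oven program.

    ramps is a list of (rate in deg C per min, target temperature in deg C),
    applied in order starting from initial_temp_c.
    """
    total = initial_hold_min
    temp = initial_temp_c
    for rate, target in ramps:
        total += (target - temp) / rate  # time spent on this linear ramp
        temp = target
    return total + final_hold_min


# Profile from the text: 150 deg C with a 30 sec hold, 40 deg C/min to 180 deg C,
# 20 deg C/min to 240 deg C, then a 2.5 min hold.
run_time = oven_program_minutes(150.0, 0.5, [(40.0, 180.0), (20.0, 240.0)], 2.5)
print(f"oven program length: {run_time:.2f} min")
```
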
Assay for glycerol-3-phosphatase, GPP

The assay for enzyme activity was performed by incubating the extract
with an organic phosphate substrate in a bis-Tris or MES and magnesium buffer,
pH 6.5. The substrate used was either l-α-glycerol phosphate, or d,l-α-glycerol
phosphate. The final concentrations of the reagents in the assay are: buffer
(20 mM, bis-Tris or 50 mM MES); MgCl2 (10 mM); and substrate (20 mM). If
the total protein in the sample was low and no visible precipitation occurred with an
acid quench, the sample was conveniently assayed in the cuvette. This method
involved incubating an enzyme sample in a cuvette that contained 20 mM
substrate (50 µL, 200 mM), 50 mM MES, 10 mM MgCl2, pH 6.5 buffer. The
final phosphatase assay volume was 0.5 mL. The enzyme-containing sample was
added to the reaction mixture; the contents of the cuvette were mixed and then the
cuvette was placed in a circulating water bath at T = 37 °C for 5 to 120 min, the
length of time depending on whether the phosphatase activity in the enzyme
sample ranged from 2 to 0.02 U/mL. The enzymatic reaction was quenched by
the addition of the acid molybdate reagent (0.4 mL). After the Fiske SubbaRow
reagent (0.1 mL) and distilled water (1.5 mL) were added, the solution was mixed
and allowed to develop. After 10 min, to allow full color development, the
absorbance of the samples was read at 660 nm using a Cary 219 UV/Vis
spectrophotometer. The amount of inorganic phosphate released was compared to
a standard curve that was prepared by using a stock inorganic phosphate solution
(0.65 mM) and preparing 6 standards with final inorganic phosphate
concentrations ranging from 0.026 to 0.130 µmol/mL.
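
The comparison against a standard curve described above can be sketched as an ordinary least-squares line through the six standards, inverted to read phosphate from a sample absorbance. This is only an illustration: the even spacing of the standards and every absorbance value below are hypothetical placeholders, and the helper names are assumptions rather than part of the assay as described.

```python
def final_concentration(stock_mM, added_uL, total_uL):
    """Concentration after dilution (C1 * V1 = C2 * V2)."""
    return stock_mM * added_uL / total_uL


def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def phosphate_from_a660(a660, slope, intercept):
    """Invert the standard curve: absorbance at 660 nm -> umol phosphate per mL."""
    return (a660 - intercept) / slope


# 50 uL of 200 mM substrate in a 0.5 mL assay gives the stated 20 mM.
substrate_mM = final_concentration(200.0, 50.0, 500.0)

# Six standards spanning 0.026 to 0.130 umol/mL (even spacing is an assumption).
standards = [0.026, 0.0468, 0.0676, 0.0884, 0.1092, 0.130]
# Hypothetical absorbance readings, invented purely for illustration.
readings = [0.114, 0.197, 0.280, 0.364, 0.447, 0.530]

slope, intercept = fit_line(standards, readings)
sample_phosphate = phosphate_from_a660(0.300, slope, intercept)
print(f"substrate: {substrate_mM} mM; sample phosphate: {sample_phosphate:.4f} umol/mL")
```
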

Spectrophotometric Assay for Glycerol 3-Phosphate Dehydrogenase (G3PDH)
Activity

The following procedure was used as modified below from a method
published by Bell et al. (1975), J. Biol. Chem., 250:7153-8. This method involved
incubating an enzyme sample in a cuvette that contained 0.2 mM NADH; 2.0 mM
dihydroxyacetone phosphate (DHAP), and enzyme in 0.1 M Tris/HCl, pH 7.5
buffer with 5 mM DTT, in a total volume of 1.0 mL at 30 °C. The
spectrophotometer was set to monitor absorbance changes at the fixed wavelength
of 340 nm. The instrument was blanked on a cuvette containing buffer only.
After the enzyme was added to the cuvette, an absorbance reading was taken. The
first substrate, NADH (50 µL 4 mM NADH; absorbance should increase approx.
1.25 AU), was added to determine the background rate. The rate should be
followed for at least 3 min. The second substrate, DHAP (50 µL 40 mM DHAP),
was then added and the absorbance change over time was monitored for at least
3 min to determine the gross rate. G3PDH activity was defined by
subtracting the background rate from the gross rate.
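
The subtraction of background rate from gross rate can be turned into an amount of NADH oxidized per minute with the Beer-Lambert law. The sketch below assumes the commonly used molar extinction coefficient of NADH at 340 nm (6220 per M per cm), a 1 cm path length, and the convention that one unit is 1 µmol of NADH consumed per minute; none of these are stated in the text, and the example rates are hypothetical.

```python
NADH_EXT_COEFF = 6220.0  # M^-1 cm^-1 at 340 nm (assumed literature value)


def nadh_absorbance(conc_mM, path_cm=1.0):
    """Expected A340 of an NADH solution via Beer-Lambert."""
    return NADH_EXT_COEFF * (conc_mM / 1000.0) * path_cm


def g3pdh_activity(gross_rate, background_rate, volume_mL=1.0, path_cm=1.0):
    """Net NADH consumption in umol/min.

    Rates are the decrease in A340 per minute, given as positive numbers.
    """
    net_rate = gross_rate - background_rate                # AU/min
    molar_per_min = net_rate / (NADH_EXT_COEFF * path_cm)  # mol/L per min
    return molar_per_min * 1e6 * (volume_mL / 1000.0)      # umol per min


# 50 uL of 4 mM NADH in 1.0 mL gives 0.2 mM, i.e. roughly the 1.25 AU jump noted above.
print(f"A340 at 0.2 mM NADH: {nadh_absorbance(0.2):.3f}")

# Hypothetical rates: 0.0722 AU/min gross, 0.0100 AU/min background.
print(f"activity: {g3pdh_activity(0.0722, 0.0100):.4f} umol/min")
```
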
PLASMID CONSTRUCTION AND STRAIN CONSTRUCTION 
Cloning and expression of glycerol 3-phosphatase for increase of glycerol
production in E. coli

The Saccharomyces cerevisiae chromosome V lambda clone 6592 (GenBank,
accession # U18813x11) was obtained from ATCC. The glycerol
3-phosphate phosphatase (GPP2) gene was cloned from the lambda
clone as target DNA using synthetic primers (SEQ ID NO:16 with
SEQ ID NO:17) incorporating a BamHI-RBS-XbaI site at the 5' end and a SmaI
site at the 3' end. The product was subcloned into pCR-Script (Stratagene,
Madison, WI) at the SrfI site to generate the plasmid pAH15 containing GPP2.
The plasmid pAH15 contains the GPP2 gene in the inactive orientation for
expression from the lac promoter in pCR-Script SK+. The BamHI-SmaI fragment
from pAH15 containing the GPP2 gene was inserted into pBlueScriptII SK+ to
generate plasmid pAH19. The pAH19 contains the GPP2 gene in the correct
orientation for expression from the lac promoter. The XbaI-PstI fragment from
pAH19 containing the GPP2 gene was inserted into pPHOX2 to create plasmid
pAH21. The pAH21/DH5α is the expression plasmid.
Plasmids for the over-expression of DAR1 in E. coli

DAR1 was isolated by PCR cloning from genomic S. cerevisiae DNA
using synthetic primers (SEQ ID NO:18 with SEQ ID NO:19). Successful PCR
cloning places an NcoI site at the 5' end of DAR1 where the ATG within NcoI is
the DAR1 initiator methionine. At the 3' end of DAR1 a BamHI site is introduced
following the translation terminator. The PCR fragments were digested with NcoI
+ BamHI and cloned into the same sites within the expression plasmid pTrc99A
(Pharmacia, Piscataway, NJ) to give pDAR1A.

In order to create a better ribosome binding site at the 5' end of DAR1, a
SpeI-RBS-NcoI linker obtained by annealing synthetic primers (SEQ ID NO:20
with SEQ ID NO:21) was inserted into the NcoI site of pDAR1A to create
pAH40. Plasmid pAH40 contains the new RBS and DAR1 gene in the correct
orientation for expression from the trc promoter of pTrc99A (Pharmacia,
Piscataway, NJ). The NcoI-BamHI fragment from pDAR1A and a second set
of SpeI-RBS-NcoI linker obtained by annealing synthetic primers (SEQ ID NO:22
with SEQ ID NO:23) was inserted into the SpeI-BamHI site of pBC-SK+
(Stratagene, Madison, WI) to create plasmid pAH42. The plasmid pAH42
contains a chloramphenicol resistance gene.
Construction of expression cassettes for DAR1 and GPP2

Expression cassettes for DAR1 and GPP2 were assembled from the
individual DAR1 and GPP2 subclones described above using standard molecular
biology methods. The BamHI-PstI fragment from pAH19 containing the
ribosomal binding site (RBS) and GPP2 gene was inserted into pAH40 to create
pAH43. The BamHI-PstI fragment from pAH19 containing the RBS and GPP2
gene was inserted into pAH42 to create pAH45.

The ribosome binding site at the 5' end of GPP2 was modified as follows.
A BamHI-RBS-SpeI linker, obtained by annealing synthetic primers
GATCCAGGAAACAGA (SEQ ID NO:24) with CTAGTCTGTTTCCTG (SEQ
ID NO:25) to the XbaI-PstI fragment from pAH19 containing the GPP2 gene, was
inserted into the BamHI-PstI site of pAH40 to create pAH48. Plasmid pAH48
contains the DAR1 gene, the modified RBS, and the GPP2 gene in the correct
orientation for expression from the trc promoter of pTrc99A (Pharmacia,
Piscataway, NJ).
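
As a quick check on the linker described above, the two oligonucleotides (SEQ ID NO:24 and SEQ ID NO:25) can be tested for complementarity in silico. The sketch below is illustrative only; the helper names and the assumed 4-nucleotide overhang length are not from the text. It shows that the oligonucleotides pair over an 11-nucleotide core and leave a 5'-GATC overhang on one strand (compatible with BamHI-cut ends) and a 5'-CTAG overhang on the other (compatible with SpeI- or XbaI-cut ends).

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def anneal(top: str, bottom: str, overhang: int = 4):
    """Check that two oligos pair after their 5' overhangs.

    Returns (top 5' overhang, duplex core on the top strand, bottom 5' overhang).
    Raises ValueError if the cores are not exact reverse complements.
    """
    core_top = top[overhang:]
    core_bottom = bottom[overhang:]
    if reverse_complement(core_top) != core_bottom:
        raise ValueError("oligonucleotides are not complementary in the core")
    return top[:overhang], core_top, bottom[:overhang]


top_overhang, core, bottom_overhang = anneal("GATCCAGGAAACAGA", "CTAGTCTGTTTCCTG")
print(top_overhang, core, bottom_overhang)
```
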

Transformation of E. coli

All the plasmids described here were transformed into E. coli DH5α using
standard molecular biology techniques. The transformants were verified by their
DNA RFLP patterns.

EXAMPLE 1

PRODUCTION OF GLYCEROL FROM E. COLI
TRANSFORMED WITH G3PDH GENE

Media 

Synthetic media was used for anaerobic or aerobic production of glycerol
using E. coli cells transformed with pDAR1A. The media contained per liter 6.0 g
Na2HPO4, 3.0 g KH2PO4, 1.0 g NH4Cl, 0.5 g NaCl, 1 mL 20% MgSO4·7H2O,
8.0 g glucose, 40 mg casamino acids, 0.5 mL 1% thiamine hydrochloride, 100 mg
ampicillin.
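
Because the recipe is stated per liter while the cultures are grown in smaller volumes, the amounts have to be scaled. A minimal sketch of that scaling follows; the dictionary layout and function name are illustrative assumptions, and the 50 mL batch size is used only as an example.

```python
# Per-liter amounts from the recipe above; units are kept alongside each value.
RECIPE_PER_LITER = {
    "Na2HPO4": (6.0, "g"),
    "KH2PO4": (3.0, "g"),
    "NH4Cl": (1.0, "g"),
    "NaCl": (0.5, "g"),
    "20% MgSO4.7H2O": (1.0, "mL"),
    "glucose": (8.0, "g"),
    "casamino acids": (40.0, "mg"),
    "1% thiamine hydrochloride": (0.5, "mL"),
    "ampicillin": (100.0, "mg"),
}


def scale_recipe(recipe, batch_mL):
    """Scale a per-liter recipe to a batch volume given in milliliters."""
    factor = batch_mL / 1000.0
    return {name: (amount * factor, unit) for name, (amount, unit) in recipe.items()}


batch = scale_recipe(RECIPE_PER_LITER, 50.0)
for name, (amount, unit) in batch.items():
    print(f"{name}: {amount:g} {unit}")
```
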
Growth Conditions 

Strain AA200 harboring pDAR1A or the pTrc99A vector was grown in
aerobic conditions in 50 mL of media shaking at 250 rpm in 250 mL flasks at
37 °C. At A600 0.2-0.3, isopropylthio-β-D-galactoside was added to a final
concentration of 1 mM and incubation continued for 48 h. For anaerobic growth,
samples of induced cells were used to fill Falcon #2054 tubes which were capped
and gently mixed by rotation at 37 °C for 48 h. Glycerol production was
determined by HPLC analysis of the culture supernatants. Strain
pDAR1A/AA200 produced 0.38 g/L glycerol after 48 h under anaerobic
conditions, and 0.48 g/L under aerobic conditions.
EXAMPLE 2
PRODUCTION OF GLYCEROL FROM E. COLI
TRANSFORMED WITH G3P PHOSPHATASE GENE (GPP2)

Media 

Synthetic phoA media was used in shake flasks to demonstrate the
increase of glycerol by GPP2 expression in E. coli. The phoA medium contained
per liter: Amisoy, 12 g; ammonium sulfate, 0.62 g; MOPS, 10.5 g; Na-citrate,
1.2 g; NaOH (1 M), 10 mL; 1 M MgSO4, 12 mL; 100X trace elements, 12 mL;
50% glucose, 10 mL; 1% thiamine, 3 0 mL; 100 mg/mL L-proline, 10 mL;
2.5 mM FeCl3, 5 mL; mixed phosphates buffer, 2 mL (5 mL 0.2 M NaH2PO4 +
9 mL 0.2 M K2HPO4), and pH to 7.0. The 100X trace elements for phoA
medium contained per liter: ZnSO4·7H2O, 0.58 g; MnSO4·H2O, 0.34 g;
CuSO4·5H2O, 0.49 g; CoCl2·6H2O, 0.47 g; H3BO3, 0.12 g; NaMoO4·2H2O, 0.48 g.
Shake Flasks Experiments
The strains pAH21/DH5α (containing GPP2 gene) and pPHOX2/DH5α
(control) were grown in 45 mL of media (phoA media, 50 µg/mL carbenicillin,
and 1 µg/mL vitamin B12) in a 250 mL shake flask at 37 °C. The cultures were
grown under aerobic conditions (250 rpm shaking) for 24 h. Glycerol production
was determined by HPLC analysis of the culture supernatant. pAH21/DH5α
produced 0.2 g/L glycerol after 24 h.

EXAMPLE 3
Production of glycerol from D-glucose using
recombinant E. coli containing both GPP2 and DAR1
Growth for demonstration of increased glycerol production by E. coli
DH5α containing pAH43 proceeds aerobically at 37 °C in shake-flask cultures
(Erlenmeyer flasks, liquid volume 1/5th of total volume).

Cultures in minimal media/1% glucose shake-flasks are started by
inoculation from overnight LB/1% glucose culture with antibiotic selection.
Minimal media are: filter-sterilized defined media, final pH 6.8 (HCl), contained
per liter: 12.6 g (NH4)2SO4, 13.7 g K2HPO4, 0.2 g yeast extract (Difco), 1 g
NaHCO3, 5 mg vitamin B12, 5 mL Modified Balch's Trace-Element Solution (the
composition of which can be found in Methods for General and Molecular
Bacteriology (P. Gerhardt et al., eds, p. 158, American Society for Microbiology,
Washington, DC (1994))). The shake-flasks are incubated at 37 °C with vigorous
shaking overnight, after which they are sampled for GC analysis of the
supernatant. The pAH43/DH5α showed glycerol production of 3.8 g/L after 24 h.
EXAMPLE 4
Production of glycerol from D-glucose using
recombinant E. coli containing both GPP2 and DAR1
Example 4 illustrates the production of glycerol by the recombinant
E. coli DH5α/pAH48, containing both the GPP2 and DAR1 genes.

The strain DH5α/pAH48 was constructed as described above in the
GENERAL METHODS.

Pre-Culture

DH5α/pAH48 were pre-cultured for seeding into a fermentation run.
Components and protocols for the pre-culture are listed below.

Pre-Culture Media
KH2PO4                        30.0 g/L
Citric acid                    2.0 g/L
MgSO4·7H2O                     2.0 g/L
98% H2SO4                      2.0 mL/L
Ferric ammonium citrate        0.3 g/L
CaCl2·2H2O                     0.2 g/L
Yeast extract                  5.0 g/L
Trace metals                   5.0 mL/L
Glucose                       10.0 g/L
Carbenicillin                100.0 mg/L

The above media components were mixed together and the pH adjusted to 
6.8 with NH4OH. The media was then filter sterilized. 

Trace metals were used according to the following recipe:

Citric acid, monohydrate       4.0 g/L
MgSO4·7H2O                     3.0 g/L
MnSO4·H2O                      0.5 g/L
NaCl                           1.0 g/L
FeSO4·7H2O                     0.1 g/L
CoCl2·6H2O                     0.1 g/L
CaCl2                          0.1 g/L
ZnSO4·7H2O                     0.1 g/L
CuSO4·5H2O                     10 mg/L
AlK(SO4)2·12H2O                10 mg/L
H3BO3                          10 mg/L
Na2MoO4·2H2O                   10 mg/L
NiSO4·6H2O                     10 mg/L
Na2SeO3                        10 mg/L
Na2WO4·2H2O                    10 mg/L
Cultures were started from seed culture inoculated from 50 µL frozen
stock (15% glycerol as cryoprotectant) to 600 mL medium in a 2-L Erlenmeyer
flask. Cultures were grown at 30 °C in a shaker at 250 rpm for approximately
1; 72 h and then used to seed the fermenter. 
Fermentation growth
Vessel

15-L stirred tank fermenter

Medium

KH2PO4                         6.8 g/L
Citric acid                    2.0 g/L
MgSO4·7H2O                     2.0 g/L
98% H2SO4                      2.0 mL/L
Ferric ammonium citrate        0.3 g/L
CaCl2·2H2O                     0.2 g/L
Mazu DF204 antifoam            1.0 mL/L

The above components were sterilized together in the fermenter vessel.
The pH was raised to 6.7 with NH4OH. Yeast extract (5 g/L) and trace metals
solution (5 mL/L) were added aseptically from filter sterilized stock solutions.
Glucose was added from 60% feed to give a final concentration of 10 g/L.
Carbenicillin was added at 100 mg/L. Volume after inoculation was 6 L.
Environmental Conditions For Fermentation

The temperature was controlled at 36 °C and the air flow rate was
controlled at 6 standard liters per minute. Back pressure was controlled at 0.5 bar.
The agitator was set at 350 rpm. Aqueous ammonia was used to control pH at 6.7.
The glucose feed (60% glucose monohydrate) rate was controlled to maintain
excess glucose.
Results

The results of the fermentation run are given in Table 1.
Table 1

EFT     OD550   [Glucose]   [Glycerol]   Total Glucose   Total Glycerol
(hr)    (AU)    (g/L)       (g/L)        Fed (g)         Produced (g)

0        0.8     9.3                      25
6        4.7     4.0         2.0          49              14
8        5.4     0           3.6          71              25
10       6.7     0.0         4.7         116              33
12       7.4     2.1         7.0         157              49
14.2    10.4     0.3        10.0         230              70
16.2    18.1     9.7        15.5         259             106
18.2    12.4    14.5                     305
20.2    11.8    17.4        17.7         353             119
22.2    11.0    12.6                     382
24.2    10.8     6.5        26.6         404             178
26.2    10.9     6.8                     442
28.2    10.*    10.3        31.5         463             216
30.2    10.2    13.1        30.4         493             213
32.2    10.1     8.1        28.2         512             196
34.2    10.2     3.5        33.4         530             223
36.2    10.1     5.8                     548
38.2     9.8     5.1        36.1         512             233
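
One way to read Table 1 is as a cumulative mass yield, i.e. total glycerol produced divided by total glucose fed at a given elapsed fermentation time. The sketch below is an illustrative calculation only, using three rows of Table 1 in which both totals are reported; the function name and the choice of rows are assumptions, and the ratio does not correct for residual glucose still in the vessel.

```python
def mass_yield(glucose_fed_g, glycerol_produced_g):
    """Cumulative yield in g glycerol per g glucose fed."""
    return glycerol_produced_g / glucose_fed_g


# (EFT in h, total glucose fed in g, total glycerol produced in g) from Table 1.
rows = [
    (20.2, 353.0, 119.0),
    (30.2, 493.0, 213.0),
    (34.2, 530.0, 223.0),
]

yields = {eft: mass_yield(fed, produced) for eft, fed, produced in rows}
for eft, value in yields.items():
    print(f"EFT {eft} h: {value:.3f} g glycerol per g glucose fed")
```
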
SEQUENCE LISTING
(1) GENERAL INFORMATION:

(i) APPLICANT:

(A) NAME: E. I. DU PONT DE NEMOURS AND COMPANY
(B) STREET: 1007 MARKET STREET
(C) CITY: WILMINGTON
(D) STATE: DELAWARE
(E) COUNTRY: U.S.A.
(F) POSTAL CODE (ZIP): 19898
(G) TELEPHONE: 302-892-8112
(H) TELEFAX: 302-773-01*4
(I) TELEX: 6717325

(A) ADDRESSEE: GENENCOR INTERNATIONAL, INC.
(B) STREET: 4 CAMBRIDGE PLACE
1870 SOUTH WINTON ROAD
(C) CITY: ROCHESTER
(D) STATE: NEW YORK
(E) COUNTRY: U.S.A.
(F) POSTAL CODE (ZIP): 14618

(ii) TITLE OF INVENTION: METHOD FOR THE PRODUCTION OF
GLYCEROL BY RECOMBINANT
ORGANISMS

(iii) NUMBER OF SEQUENCES: 25

(iv) COMPUTER READABLE FORM:

(A) MEDIUM TYPE: DISKETTE, 3.5 INCH
(B) COMPUTER: IBM PC COMPATIBLE
(C) OPERATING SYSTEM: MICROSOFT WORD FOR WINDOWS 95
(D) SOFTWARE: MICROSOFT WORD VERSION 7.0A

(v) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER: 

(B) FILING DATE: 

(C) CLASSIFICATION: 

(vi) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: <60/03602 

(B) FILING DATE: NOVEMBER 13, 1996 

(C) CLASSIFICATION: 

(vii) ATTORNEY/AGENT INFORMATION:

(A) NAME: FLOYD, LINDA AXAMETHY

(B) REGISTRATION NUMBER: 33,692

(C) REFERENCE / DOCKET NUMBER: CR-9981-P1 
(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1380 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:



CTTTAATTTT CTTTTATCTT ACTCTCCTAC ATAAGACATC AAGAAACAAT TGTATATTGT 60
ACACCCCCCC CCTCCACAAA CACAAATATT GATAATATAA AGATGTCTGC TGCTGCTGAT 120
AGATTAAACT TAACTTCCGG CCACTTGAAT GCTGGTAGAA AGAGAAGTTC CTCTTCTGTT 180
TCTTTGAAGG CTGCCGAAAA GCCTTTCAAG GTTACTGTGA TTGGATCTGG TAACTGGGGT 240
ACTACTATTG CCAAGGTGGT TGCCGAAAAT TGTAAGGGAT ACCCAGAAGT TTTCGCTCCA 300
ATAGTACAAA TGTGGGTGTT CGAAGAAGAG ATCAATGGTG AAAAATTGAC TGAAATCATA 360
AATACTAGAC ATCAAAACGT GAAATACTTG CCTGGCATCA CTCTACCCGA CAATTTGGTT 420
GCTAATCCAG ACTTGATTGA TTCAGTCAAG GATGTCGACA TCATCGTTTT CAACATTCCA 480
CATCAATTTT TGCCCCGTAT CTGTAGCCAA TTGAAAGGTC ATGTTGATTC ACACGTCAGA 540
GCTATCTCCT GTCTAAAGGG TTTTGAAGTT GGTGCTAAAG GTGTCCAATT GCTATCCTCT 600
TACATCACTG AGGAACTAGG TATTCAATGT GGTGCTCTAT CTGGTGCTAA CATTGCCACC 660
GAAGTCGCTC AAGAACACTG GTCTGAAACA ACAGTTGCTT ACCACATTCC AAAGGATTTC 720
AGAGGCGAGG GCAAGGACGT CGACCATAAG GTTCTAAAGG CCTTGTTCCA CAGACCTTAC 780
TTCCACGTTA GTGTCATCGA AGATGTTGCT GGTATCTCCA TCTGTGGTGC TTTGAAGAAC 840
GTTGTTGCCT TAGGTTGTGG TTTCGTCGAA GGTCTAGGCT GGGGTAACAA CGCTTCTGCT 900
GCCATCCAAA GAGTCGGTTT GGGTGAGATC ATCAGATTCG GTCAAATGTT TTTCCCAGAA 960
TCTAGAGAAG AAACATACTA CCAAGAGTCT GCTGGTGTTG CTGATTTGAT CACCACCTGC 1020
GCTGGTGGTA GAAACGTCAA GGTTGCTAGG CTAATGGCTA CTTCTGGTAA GGACGCCTGG 1080
GAATGTGAAA AGGAGTTGTT GAATGGCCAA TCCGCTCAAG GTTTAATTAC CTGCAAAGAA 1140
GTTCACGAAT GGTTGGAAAC ATGTGGCTCT GTCGAAGACT TCCCATTATT TGAAGCCGTA 1200
TACCAAATCG TTTACAACAA CTACCCAATG AAGAACCTGC CGGACATGAT TGAAGAATTA 1260
GATCTACATG AAGATTAGAT TTATTGGAGA AAGATAACAT ATCATACTTC CCCCACTTTT 1320
TTCGAGGCTC TTCTATATCA TATTCATAAA TTAGCATTAT GTCATTTCTC ATAACTACTT 1380
(2) INFORMATION FOR SEQ ID NO:2:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2946 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:

GAATTCGAGC CTGAAGTGCT GATTACCTTC AGGTAGACTT CATCTTGACC CATCAACCCC 60 

AGCGTCAATC CTGCAAATAC ACCACCCAGC AGCACTAGGA TGATAGAGAT AATATAGTAC 120 

GTGGTAACGC TTGCCTCATC ACCTACGCTA TGGCCGGAAT CGGCAACATC CCTAGAATTG 180 

AGTACGTGTG ATCCGGATAA CAACGGCAGT GAATATATCT TCGGTATCGT AAAGATGTGA 240 

TATAAGATGA TGTATACCCA ATGAGGAGCG CCTGATCGTG ACCTAGACCT TAGTGGCAAA 300 

AACGACATAT CTATTATAGT GGGGAGAGTT TCGTGCAAAT AACAGACGCA GCAGCAAGTA 360 

ACTGTGACGA TATCAACTCT TTTTTTATTA TGTAATAAGC AAACAAGCAC GAATGGGGAA 420 
AGCCTATGTG CAATCACCAA GGTCGTCCCT TTTTTCCCAT TTGCTAATTT AGAATTTAAA 480

GAAACCAAAA GAATGAAGAA AGAAAACAAA TACTAGCCCT AACCCTGACT TCGTTTCTAT 540 

GATAATACCC TGCTTTAATG AACGGTATGC CCTAGGGTAT ATCTCACTCT GTACGTTACA 600 

AACTCCGGTT ATTTTATCGG AACATCCGAG CACCCGCGCC TTCCTCAACC CAGGCACCGC 660 

CCCAGGTAAC CGTGCGCGAT GAGCTAATCC TGAGCCATCA CCCACCCCAC CCGTTGATGA 720

CAGCAATTCG GGAGGGCGAA AATAAAACTG GAGCAAGGAA TTACCATCAC CGTCACCATC 780 

ACCATCATAT CGCCTTAGCC TCTAGCCATA GCCATCATGC AAGCGTGTAT CTTCTAAGAT 840 

TCAGTCATCA TCATTACCGA GTTTGTTTTC CTTCACATGA TGAAGAAGGT TTGAGTATGC 900 

TCGAAACAAT AAGACGACGA TGGCTCTGCC ATTGGTTATA TTACGCTTTT GCGGCGAGGT 960 

GCCGATGGGT TGCTGAGGGG AAGAGTGTTT AGCTTACGGA CCTATTGCCA TTGTTATTCC 1020 

GATTAATCTA TTGTTCAGCA GCTCTTCTCT ACCCTGTCAT TCTAGTATTT TTTTTTTTTT 1080

TTTTTGGTTT TACTTTTTTT TCTTCTTGCC TTTTTTTCTT GTTACTTTTT TTCTAGTTTT 1140 

TTTTCCTTCC ACTAAGCTTT TTCCTTGATT TATCCTTGGG TTCTTCTTTC TACTCCTTTA 1200

GATTTTTTTT TTATATATTA ATTTTTAAGT TTATGTATTT TGGTAGATTC AATTCTCTTT 1260

CCCTTTCCTT TTCCTTCGCT CCCCTTCCTT ATCAATGCTT GCTGTCAGAA GATTAACAAG 1320 

ATACACATTC CTTAAGCGAA CGCATCCGGT GTTATATACT CGTCGTGCAT ATAAAATTTT 1380 
GCCTTCAAGA TCTACTTTCC TAAGAAGATC ATTATTACAA ACACAACTGC ACTCAAAGAT 1440
GACTGCTCAT ACTAATATCA AACAGCACAA ACACTGTCAT GAGGACCATC CTATCAGAAG 1500
ATCGGACTCT GCCGTGTCAA TTGTACATTT GAAACGTGCG CCCTTCAAGG TTACAGTGAT 1560
TGGTTCTGGT AACTGGGGGA CCACCATCGC CAAAGTCATT GCGGAAAACA CAGAATTGCA 1620
TTCCCATATC TTCGAGCCAG AGGTGAGAAT GTGGGTTTTT GATGAAAAGA TCGGCGACGA 1680 
AAATCTGACG GATATCATAA ATACAAGACA CCAGAACGTT AAATATCTAC CCAATATTGA 1740 
CCTGCCCCAT AATCTAGTGG CCGATCCTGA TCTTTTACAC TCCATCAAGG GTGCTGACAT 1800 
CCTTGTTTTC AACATCCCTC ATCAATTTTT ACCAAACATA GTCAAACAAT TGCAAGGCCA 1860
CGTGGCCCCT CATGTAAGGG CCATCTCGTG TCTAAAAGGG TTCGAGTTGG GCTCCAAGGG 1920 
TGTGCAATTG CTATCCTCCT ATGTTACTGA TGAGTTAGGA ATCCAATGTG GCGCACTATC 1980 
TGGTGCAAAC TTGGCACCGG AAGTGGCCAA GGAGCATTGG TCCGAAACCA CCGTGGCTTA 2040
CCAACTACCA AAGGATTATC AAGGTGATGG CAAGGATGTA GATCATAAGA TTTTGAAATT 2100 
GCTGTTCCAC AGACCTTACT TCCACGTCAA TGTCATCGAT GATGTTGCTG GTATATCCAT 2160 
TGCCGGTGCC TTGAAGAACG TCGTGGCACT TGCATGTGGT TTCGTAGAAG GTATGGGATG 2220 
GGGTAACAAT GCCTCCGCAG CCATTCAAAG GCTGGGTTTA GGTGAAATTA TCAAGTTCGG 2280 
TAGAATGTTT TTCCCAGAAT CCAAAGTCGA GACCTACTAT CAAGAATCCG CTGGTGTTGC 2340 
AGATCTGATC ACCACCTGCT CAGGCGGTAG AAACGTCAAG GTTGCCACAT ACATGGCCAA 2400 
GACCGGTAAG TCAGCCTTGG AAGCAGAAAA GGAATTGCTT AACGGTCAAT CCGCCCAAGG 2460
GATAATCACA TGCAGAGAAG TTCACGAGTG GCTACAAACA TGTGAGTTGA CCCAAGAATT 2520 
CCCAATTATT CGAGGCAGTC TACCAGATAG TCTACAACAA CGTCCGCATG GAAGACCTAC 2580 
CGGAGATGAT TGAAGAGCTA GACATCGATG ACGAATAGAC ACTCTCCCCC CCCCTCCCCC 2640 
TCTGATCTTT CCTGTTGCCT CTTTTTCCCC CAACCAATTT ATCATTATAC ACAAGTTCTA 2700 
CAACTACTAC TAGTAACATT ACTACAGTTA TTATAATTTT CTATTCTCTT TTTCTTTAAG 2760 
AATCTATCAT TAACGTTAAT TTCTATATAT ACATAACTAC CATTATACAC GCTATTATCG 2820 
TTTACATATC ACATCACCGT TAATGAAAGA TACGACACCC TGTACACTAA CACAATTAAA 2880 
TAATCGCCAT AACCTTTTCT GTTATCTATA GCCCTTAAAG CTGTTTCTTC GAGCTTTTCA 2940
CTGCAG 2946

(2) INFORMATION FOR SEQ ID NO:3:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 3178 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:



CTGCAGAACT TCGTCTGCTC TGTGCCCATC CTCGCGGTTA GAAAGAAGCT GAATTGTTTC 60
ATGCGCAAGG GCATCAGCGA GTGACCAATA ATCACTGCAC TAATTCCTTT TTAGCAACAC 120
ATACTTATAT ACAGCACCAG ACCTTATGTC TTTTCTCTGC TCCGATACGT TATCCCACCC 180
AACTTTTATT TCAGTTTTGG CAGGGGAAAT TTCACAACCC CGCACGCTAA AAATCGTATT 240
TAAACTTAAA AGAGAACAGC CACAAATAGG GAACTTTGGT CTAAACGAAG GACTCTCCCT 300
CCCTTATCTT GACCGTGCTA TTGCCATCAC TGCTACAAGA CTAAATACGT ACTAATATAT 360
GTTTTCGGTA ACGAGAAGAA GAGCTGCCGG TGCAGCTGCT GCCATGGCCA CAGCCACGGG 420
GACGCTGTAC TGGATGACTA GCCAAGGTGA TAGGCCGTTA GTGCACAATG ACCCGAGCTA 480
CATGGTGCAA TTCCCCACCG CCGCTCCACC GGCAGGTCTC TAGACGAGAC CTGCTGGACC 540
GTCTGGACAA GACGCATCAA TTCGACGTGT TGATCATCGG TGGCGGGGCC ACGGGGACAG 600
GATGTGCCCT AGATGCTGCG ACCAGGGGAC TCAATGTGGC CCTTGTTGAA AAGGGGGATT 660
TTGCCTCGGG AACGTCGTCC AAATCTACCA AGATGATTCA CGGTGGGGTG CGGTACTTAG 720
AGAAGGCCTT CTGGGAGTTC TCCAAGGCAC AACTGGATCT GGTCATCGAG GCACTCAACG 780
AGCGTAAACA TCTTATCAAC ACTGCCCCTC ACCTGTGCAC GGTGCTACCA ATTCTGATCC 840
CCATCTACAG CACCTGGCAG GTCCCGTACA TCTATATGGG CTGTAAATTC TACGATTTCT 900
TTGGCGGTTC CCAAAACTTG AAAAAATCAT ACCTACTGTC CAAATCCGCC ACCGTGGAGA 960
AGGCTCCCAT GCTTACCACA GACAATTTAA AGGCCTCGCT TGTGTACCAT GATGGGTCCT 1020
TTAACGACTC GCGTTTGAAC GCCACTTTAG CCATCACGGG TGTGGAGAAC GGCGCTACCG 1080
TCTTGATCTA TGTCGAGGTA CAAAAATTGA TCAAAGACCC AACTTCTGGT AAGGTTATCG 1140
GTGCCGAGGC CCGGGACGTT GAGACTAATG AGCTTGTCAG AATCAACGCT AAATGTGTGG 1200
TCAATGCCAC GGGCCCATAC AGTGACGCCA TTTTGCAAAT GGACCGCAAC CCATCCGGTC 1260
TGCCGGACTC CCCGCTAAAC GACAACTCCA AGATCAAGTC GACTTTCAAT CAAATCTCCG 1320
TCATGGACCC GAAAATGGTC ATCCCATCTA TTGGCGTTCA CATCGTATTG CCCTCTTTTT 1380
ACTCCCCGAA GGATATGGGT TTGTTGGACG TCAGAACCTC TGATGGCAGA GTGATGTTCT 1440
TTTTACCTTG GCAGGGCAAA GTCCTTGCCG GCACCACAGA CATCCCACTA AAGCAAGTCC 1500
CAGAAAACCC


TATGCCTACA 


rarrrTrnTA TTCAAGATAT CTTGAAAGAA CTACAGCACT 


1560 


ATATCGAATT 


CCCCGTGAAA 


ftrftrftnrAPG TGCTAAGTGC ATGGGCTGGT GTCAGACCTT 


1620 


TGGTCAGAGA 


TCCACGTACA 


nTpprrrmf: APGGGAAGAA GGGCTCTGCC ACTCAGGGCG 


1680 


TGGTAAGATC 


CCACTTCTTG 


n»»t»r , T\r , T»Pnr»r t ft.TTAft.Tf^CPPT AfiTTAPTATT GPAGGTGGTA 


17<0 


AATGGACTAC 


TTACAGACAA 


_ _ _ — «. m«* * * ft r* ft /—TOP ft PLZi.fc.P.TTGTP P.AAGTTGGCG 
ATGGCTGACjo A/iMUiio 1 tun Wvi/io i IwlV. ortrtw i 1 wv*« 


1800 


GATTCCACAA 


CCTGAAACCT 


rp/-T»r*T\ r*T\ r* r ft ^ft.Pft.Tft.TTU. A flPTTGPTGGT GPAGAAGAAT 


1860 


GGACGCAAAA 


CTATGTGGCT 


nuTttkipm/Trqtr ftftRftTTftPrfc TTT&TPliTrt ft ft ft HTflTPPA 


1920 


ACTACTTGGT 


TCAAAACTAC 


-~ r~,r**r>r^rT<m OnTT^T TV T^fTA TTP^f'P* ft ft TTT r P'T , r*ft ft ft /"I ft ft I* 

GGAACCCGTT CCTCTATCAT TxotbHAi J i 1 J u/iH/u*>iHJ 


x 7 0 V 


CCATGGAAAA 


TAAACTGCCT 


mmnmn/<IMTl TV f C+ r* f TV f* TV TV /"* ^* IV ft ft ft T ft ft f^CT ft ft 'Pf^T ft PTP T ft 

TTGTCCTTAG CCGACAAGGA AAATAAUbTA AJ tl/il»il»ifi 


twiv 


GCGAGGAGAA 


CAACTTGGTC 


^ «r>im»*%mr**km*L ^mmmr tv ^Tv »p tv Tv •P^T^Tv ^ ft ft TPPPTT" ft f — *!■ 

AATTTTGATA CTTTCAGATA TCCAJTUAuA A J CLjvj> j ijASai 


2100


TAAAGTATTC CATGCAGTAC GAATATTGTA GAACTCCCTT GGACTTCCTT TTAAGAAGAA 2160


CAAGATTCGC 


CTTCTTGGAC 


GCCAAGGAAG CTTTGAATGG CGTGCATGCC ACCGTCAAAfc 




TTATGGGTGA 


TGAGTTCAAT 


TGGTCGGAGA AAAAGAGGCA GTGGGAACTT GAAAAAA^Jb 




TGAACTTCAT 


CCAAGGACGT 


TTCGGTGTCT AAATCGATCA TGATAGTTAA GGGTGACAAA 




GATAACATTC 


ACAAGAGTAA 


TAATAATGGT AATGATGATA ATAATAATAA TGATACs TAAI 


2400


AACAATAATA 


ATAATGGTGG 


TAATGGCAAT GAAATCGCTA TTATTACCTA TTTTCCTTAA 




TGGAAGAGTT 


AAAGTAAACT 


AAAAAAACTA CAAAAATATA TGAAGAAAAA AAAAAAAAvaA 


2520


GGTAATAGAC 


TCTACTACTA 


CAATTGATCT TCAAATTATG ACCTTCCTAG rGTTTAT/wl 




CTATTTCCAA 


TACATAATAT 


•«. m * m >> m*\ tv n»/^iv T»rt>o^»»r«^ /^Tft r" ft fTTP/* /~*T , T r rT , ft ftT*ZiT 

AATCTATATA ATCATTGCTG GTAGAtJiUv- \si 1 i inni/ii 


2640


CGTTTTAATT 


ATCCCCTTTA 


TCTCTAGTCT AGTTTTATCA TAAAATAJAw Aft/iL^iL. J /iM« 


2700 


TAATATTCTT 


CAAACGGTCC 


•n^f«fflo/<iiiTt«/« rr tv tv t »v r i\ »r ft TTTM'rr'Pcr ftftft.ft.ft.ft.ft.fi.AA 
TGGTGCATAC GCAATACATA li TAlbbibv, Hj\hjvvij\t\nn 


2760 


ATGGAAAATT 


TTGCTAGTCA 


_ « » _^--niir /*>TV *TTV TV TV TvOft ft T ft ^r* T ft /** ft ft TP^PTHPTT^ 

TAAACCCTTT CATAAAACAA TAL,V3l AQaAUM 1 U^v- 1 ml Jio 


2820


AAATTTTCAA 


GTTTTTATCA 


GATCCATGTT TCCTATCTGC I I CsAuAHCC 1 Ufti tva 1 v~va« 


2880 


AATAGTACCA TTTAGAACGC 


CCAATATTCA CATTGTvsTTC AAWj JLJ HM 1J umuumo 1 \a 


2940 


ACGTGTAATG GCCATGATTA ATGTGCCTGT ATGGTTAACC ACTCCAAATA GCTTATATTT 3000

CATAGTGTCA TTGTTTTTCA ATATAATGTT TAGTATCAAT GGATATGTTA CGACGGTGTT 3060

ATTTTTCTTG GTCAAATCGT AATAAAATCT CGATAAATGG ATGACTAAGA TTTTTGGTAA 3120

AGTTACAAAA TTTATCGTTT TCACTGTTGT CAATTTTTTG TTCTTGTAAT CACTCGAG 3178
(2) INFORMATION FOR SEQ ID NO: 4:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 816 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

ATGAAACGTT TCAATGTTTT AAAATATATC AGAACAACAA AAGCAAATAT ACAAACCATC 60 

GCAATGCCTT TGACCACAAA ACCTTTATCT TTGAAAATCA ACGCCGCTCT ATTCGATGTT 120 

GACGGTACCA TCATCATCTC TCAACCAGCC ATTGCTGCTT TCTGGAGAGA TTTCGGTAAA 180 

GACAAGCCTT ACTTCGATGC CGAACACGTT ATTCACATCT CTCACGGTTG GAGAACTTAC 240 

GATGCCATTG CCAAGTTCGC TCCAGACTTT GCTGATGAAG AATACGTTAA CAAGCTAGAA 300 

GGTGAAATCC CAGAAAAGTA CGGTGAACAC TCCATCGAAG TTCCAGGTGC TGTCAAGTTG 360 

TGTAATGCTT TGAACGCCTT GCCAAAGGAA AAATGGGCTG TCGCCACCTC TGGTACCCGT 420 

GACATGGCCA AGAAATGGTT CGACATTTTG AAGATCAAGA GACCAGAATA CTTCATCACC 480 

GCCAATGATG TCAAGCAAGG TAAGCCTCAC CCAGAACCAT ACTTAAAGGG TAGAAACGGT 540 

TTGGGTTTCC CAATTAATGA ACAAGACCCA TCCAAATCTA AGGTTGTTGT CTTTGAAGAC 600 

GCACCAGCTG GTATTGCTGC TGGTAAGGCT GCTGGCTGTA AAATCGTTGG TATTGCTACC 660 

ACTTTCGATT TGGACTTCTT GAAGGAAAAG GGTTGTGACA TCATTGTCAA GAACCACGAA 720 

TCTATCAGAG TCGGTGAATA CAACGCTGAA ACCGATGAAG TCGAATTGAT CTTTGATGAC 780 

TACTTATACG CTAAGGATGA CTTGTTGAAA TGGTAA 816
(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 753 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 
ATGGGATTGA CTACTAAACC TCTATCTTTG AAAGTTAACG CCGCTTTGTT CGACGTCGAC 60 
GGTACCATTA TCATCTCTCA ACCAGCCATT GCTGCATTCT GGAGGGATTT CGGTAAGGAC 120 
AAACCTTATT TCGATGCTGA ACACGTTATC CAAGTCTCGC ATGGTTGGAG AACGTTTGAT 180
GCCATTGCTA AGTTCGCTCC AGACTTTGCC AATGAAGAGT ATGTTAACAA ATTAGAAGCT 240

GAAATTCCGG TCAAGTACGG TGAAAAATCC ATTGAAGTCC CAGGTGCAGT TAAGCTGTGC 300 

AACGCTTTGA ACGCTCTACC AAAAGAGAAA TGGGCTGTGG CAACTTCCGG TACCCGTGAT 360 

ATGGCACAAA AATGGTTCGA GCATCTGGGA ATCAGGAGAC CAAAGTACTT CATTACCGCT 420

AATGATGTCA AACAGGGTAA GCCTCATCCA GAACCATATC TGAAGGGCAG GAATGGCTTA 480 

GGATATCCGA TCAATGAGCA AGACCCTTCC AAATCTAAGG TAGTAGTATT TGAAGACGCT 540

CCAGCAGGTA TTGCCGCCGG AAAAGCCGCC GGTTGTAAGA TCATTGGTAT TGCCACTACT 600 

TTCGACTTGG ACTTCCTAAA GGAAAAAGGC TGTGACATCA TTGTCAAAAA CCACGAATCC 660 

ATCAGAGTTG GCGGCTACAA TGCCGAAACA GACGAAGTTG AATTCATTTT TGACGACTAC 720

TTATATGCTA AGGACGATCT GTTGAAATGG TAA 753
(2) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2520 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 

TGTATTGGCC ACGATAACCA CCCTTTGTAT ACTGTTTTTG TTTTTCACAT GGTAAATAAC 60 

GACTTTTATT AAACAACGTA TGTAAAAACA TAACAAGAAT CTACCCATAC AGGCCATTTC 120 

GTAATTCTTC TCTTCTAATT GGAGTAAAAC CATCAATTAA AGGGTGTGGA GTAGCATAGT 180

GAGGGGCTGA CTGCATTGAC AAAAAAATTG AAAAAAAAAA AGGAAAAGGA AAGGAAAAAA 240 

AGACAGCCAA GACTTTTAGA ACGGATAAGG TGTAATAAAA TGTGGGGGGA TGCCTGTTCT 300 

CGAACCATAT AAAATATACC ATGTGGTTTG AGTTGTGGCC GGAACTATAC AAATAGTTAT 360 

ATGTTTCCCT CTCTCTTCCG ACTTGTAGTA TTCTCCAAAC GTTACATATT CCGATCAAGC 420 

CAGCGCCTTT ACACTAGTTT AAAACAAGAA CAGAGCCGTA TGTCCAAAAT AATGGAAGAT 4 80 

TTACGAAGTG ACTACGTCCC GCTTATCGCC AGTATTGATG TAGGAACGAC CTCATCCAGA 540 

TGCATTCTGT TCAACAGATG GGGCCAGGAC GTTTCAAAAC ACCAAATTGA ATATTCAACT 600 

TCAGCATCGA AGGGCAAGAT TGGGGTGTCT GGCCTAAGGA GACCCTCTAC AGCCCCAGCT 660 

CGTGAAACAC CAAACGCCGG TGACATCAAA ACCAGCGGAA AGCCCATCTT TTCTGCAGAA 720 

GGCTATGCCA TTCAAGAAAC CAAATTCCTA AAAATCGAGG AATTGGACTT GGACTTCCAT 780 
AACGAACCCA CGTTGAAGTT CCCCAAACCG GGTTGGGTTG AGTGCCATCC GCAGAAATTA 840
CTGGTGAACG TCGTCCAATG CCTTGCCTCA AGTTTGCTCT CTCTGCAGAC TATCAACAGC 900 
GAACGTGTAG CAAACGGTCT CCCACCTTAC AAGGTAATAT GCATGGGTAT AGCAAACATG 960
AGAGAAACCA CAATTCTGTG GTCCCGCCGC ACAGGAAAAC CAATTGTTAA CTACGGTATT 1020 
GTTTGGAACG ACACCAGAAC GATCAAAATC GTTAGAGACA AATGGCAAAA CACTAGCGTC 1080 
GATAGGCAAC TGCAGCTTAG ACAGAAGACT GGATTGCCAT TGCTCTCCAC GTATTTCTCC 1140
TGTTCCAAGC TGCGCTGGTT CCTCGACAAT GAGCCTCTGT GTACCAAGGC GTATGAGGAG 1200 
AACGACCTGA TGTTCGGCAC TGTGGACACA TGGCTGATTT ACCAATTAAC TAAACAAAAG 1260 
GCGTTCGTTT CTGACGTAAC CAACGCTTCC AGAACTGGAT TTATGAACCT CTCCACTTTA 1320
AAGTACGACA ACGAGTTGCT GGAATTTTGG GGTATTGACA AGAACCTGAT TCACATGCCC 1380 
GAAATTGTGT CCTCATCTCA ATACTACGGT GACTTTGGCA TTCCTGATTG GATAATGGAA 1440 
AAGCTACACG ATTCGCCAAA AACAGTACTG CGAGATCTAG TCAAGAGAAA CCTGCCCATA 1500 
CAGGGCTGTC TGGGCGACCA AAGCGCATCC ATGGTGGGGC AACTCGCTTA CAAACCCGGT 1560 
GCTGCAAAAT GTACTTATGG TACCGGTTGC TTTTTACTGT ACAATACGGG GACCAAAAAA 1620 
TTGATCTCCC AACATGGCGC ACTGACGACT CTAGCATTTT GGTTCCCACA TTTGCAAGAG 1680 
TACGGTGGCC AAAAACCAGA ATTGAGCAAG CCACATTTTG CATTAGAGGG TTCCGTCGCT 1740 
GTGGCTGGTG CTGTGGTCCA ATGGCTACGT GATAATTTAC GATTGATCGA TAAATCAGAG 1800 
GATGTCGGAC CGATTGCATC TACGGTTCCT GATTCTGGTG GCGTAGTTTT CGTCCCCGCA 1860
TTTAGIGGCC TATTCGCTCC CTATTGGGAC CCAGATGCCA GAGCCACCAT AATGGGGATG 1920 
TCTCAATTCA CTACTGGCTC CCACATGGCC AGAGCTGCCG TGGAAGGTGT TTGCTTTCAA 1980 
GCCAGGGCTA TCTTGAAGGC AATGAGTTCT GACGCGTTTG GTGAAGGTTC CAAAGACAGG 2040
GACTTTTTAG AGGAAATTTC CGACGTCACA TATGAAAAGT CGCCCCTGTC GGTTCTGGCA 2100 
GTGGATGGCG GGATGTCGAG GTCTAATGAA GTCATGCAAA TTCAAGCCGA TATCCTAGGT 2160
CCCTGTGTCA AAGTCAGAAG GTCTCCGACA GCGGAATGTA CCGCATTGGG GGCAGCCATT 2220 
GCAGCCAATA TGGCTTTCAA GGATGTGAAC GAGCGCCCAT TATGGAAGGA CCTACACGAT 2280
GTTAAGAAAT GGGTCTTTTA CAATGGAATG GAGAAAAACG AACAAATATC ACCAGAGGCT 2340 
CATCCAAACC TTAAGATATT CAGAAGTGAA TCCGACGATG CTGAAAGGAG AAAGCATTGG 2400 
AAGTATTGGG AAGTTGCCGT GGAAAGATCC AAAGGTTGGC TGAAGGACAT AGAAGGTGAA 2460
CACGAACAGG TTCTAGAAAA CTTCCAATAA CAACATAAAT AATTTCTATT AACAATGTAA 2520 
(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 391 amino acids
(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown
(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:

Met Ser Ala Ala Ala Asp Arg Leu Asn Leu Thr Ser Gly His Leu Asn
1 5 10 15

Ala Gly Arg Lys Arg Ser Ser Ser Ser Val Ser Leu Lys Ala Ala Glu
20 25 30

Lys Pro Phe Lys Val Thr Val Ile Gly Ser Gly Asn Trp Gly Thr Thr
35 40 45

Ile Ala Lys Val Val Ala Glu Asn Cys Lys Gly Tyr Pro Glu Val Phe
50 55 60

Ala Pro Ile Val Gln Met Trp Val Phe Glu Glu Glu Ile Asn Gly Glu
65 70 75 80

Lys Leu Thr Glu Ile Ile Asn Thr Arg His Gln Asn Val Lys Tyr Leu
85 90 95

Pro Gly Ile Thr Leu Pro Asp Asn Leu Val Ala Asn Pro Asp Leu Ile
100 105 110

Asp Ser Val Lys Asp Val Asp Ile Ile Val Phe Asn Ile Pro His Gln
115 120 125

Phe Leu Pro Arg Ile Cys Ser Gln Leu Lys Gly His Val Asp Ser His
130 135 140

Val Arg Ala Ile Ser Cys Leu Lys Gly Phe Glu Val Gly Ala Lys Gly
145 150 155 160

Val Gln Leu Leu Ser Ser Tyr Ile Thr Glu Glu Leu Gly Ile Gln Cys
165 170 175

Gly Ala Leu Ser Gly Ala Asn Ile Ala Thr Glu Val Ala Gln Glu His
180 185 190

Trp Ser Glu Thr Thr Val Ala Tyr His Ile Pro Lys Asp Phe Arg Gly
195 200 205

Glu Gly Lys Asp Val Asp His Lys Val Leu Lys Ala Leu Phe His Arg
210 215 220

Pro Tyr Phe His Val Ser Val Ile Glu Asp Val Ala Gly Ile Ser Ile
225 230 235 240

Cys Gly Ala Leu Lys Asn Val Val Ala Leu Gly Cys Gly Phe Val Glu
245 250 255

Gly Leu Gly Trp Gly Asn Asn Ala Ser Ala Ala Ile Gln Arg Val Gly
260 265 270

Leu Gly Glu Ile Ile Arg Phe Gly Gln Met Phe Phe Pro Glu Ser Arg
275 280 285

Glu Glu Thr Tyr Tyr Gln Glu Ser Ala Gly Val Ala Asp Leu Ile Thr
290 295 300

Thr Cys Ala Gly Gly Arg Asn Val Lys Val Ala Arg Leu Met Ala Thr
305 310 315 320

Ser Gly Lys Asp Ala Trp Glu Cys Glu Lys Glu Leu Leu Asn Gly Gln
325 330 335

Ser Ala Gln Gly Leu Ile Thr Cys Lys Glu Val His Glu Trp Leu Glu
340 345 350

Thr Cys Gly Ser Val Glu Asp Phe Pro Leu Phe Glu Ala Val Tyr Gln
355 360 365

Ile Val Tyr Asn Asn Tyr Pro Met Lys Asn Leu Pro Asp Met Ile Glu
370 375 380

Glu Leu Asp Leu His Glu Asp
385 390

(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 384 amino acids 
(B) TYPE: amino acid

(C) STRANDEDNESS: unknown 

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:

Met Thr Ala His Thr Asn Ile Lys Gln His Lys His Cys His Glu Asp
1 5 10 15

His Pro Ile Arg Arg Ser Asp Ser Ala Val Ser Ile Val His Leu Lys
20 25 30

Arg Ala Pro Phe Lys Val Thr Val Ile Gly Ser Gly Asn Trp Gly Thr
35 40 45

Thr Ile Ala Lys Val Ile Ala Glu Asn Thr Glu Leu His Ser His Ile
50 55 60

Phe Glu Pro Glu Val Arg Met Trp Val Phe Asp Glu Lys Ile Gly Asp
65 70 75 80

Glu Asn Leu Thr Asp Ile Ile Asn Thr Arg His Gln Asn Val Lys Tyr
85 90 95

Leu Pro Asn Ile Asp Leu Pro His Asn Leu Val Ala Asp Pro Asp Leu
100 105 110

Leu His Ser Ile Lys Gly Ala Asp Ile Leu Val Phe Asn Ile Pro His
115 120 125

Gln Phe Leu Pro Asn Ile Val Lys Gln Leu Gln Gly His Val Ala Pro
130 135 140

His Val Arg Ala Ile Ser Cys Leu Lys Gly Phe Glu Leu Gly Ser Lys
145 150 155 160

Gly Val Gln Leu Leu Ser Ser Tyr Val Thr Asp Glu Leu Gly Ile Gln
165 170 175

Cys Gly Ala Leu Ser Gly Ala Asn Leu Ala Pro Glu Val Ala Lys Glu
180 185 190

His Trp Ser Glu Thr Thr Val Ala Tyr Gln Leu Pro Lys Asp Tyr Gln
195 200 205

Gly Asp Gly Lys Asp Val Asp His Lys Ile Leu Lys Leu Leu Phe His
210 215 220

Arg Pro Tyr Phe His Val Asn Val Ile Asp Asp Val Ala Gly Ile Ser
225 230 235 240

Ile Ala Gly Ala Leu Lys Asn Val Val Ala Leu Ala Cys Gly Phe Val
245 250 255

Glu Gly Met Gly Trp Gly Asn Asn Ala Ser Ala Ala Ile Gln Arg Leu
260 265 270

Gly Leu Gly Glu Ile Ile Lys Phe Gly Arg Met Phe Phe Pro Glu Ser
275 280 285

Lys Val Glu Thr Tyr Tyr Gln Glu Ser Ala Gly Val Ala Asp Leu Ile
290 295 300

Thr Thr Cys Ser Gly Gly Arg Asn Val Lys Val Ala Thr Tyr Met Ala
305 310 315 320

Lys Thr Gly Lys Ser Ala Leu Glu Ala Glu Lys Glu Leu Leu Asn Gly
325 330 335

Gln Ser Ala Gln Gly Ile Ile Thr Cys Arg Glu Val His Glu Trp Leu
340 345 350

Gln Thr Cys Glu Leu Thr Gln Glu Phe Pro Ile Ile Arg Gly Ser Leu
355 360 365

Pro Asp Ser Leu Gln Gln Arg Pro His Gly Arg Pro Thr Gly Asp Asp
370 375 380
(2) INFORMATION FOR SEQ ID NO: 9:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 614 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown
(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 

Met Thr Arg Ala Thr Trp Cys Asn Ser Pro Pro Pro Leu His Arg Gln
1 5 10 15

Val Ser Arg Arg Asp Leu Leu Asp Arg Leu Asp Lys Thr His Gln Phe
20 25 30

Asp Val Leu Ile Ile Gly Gly Gly Ala Thr Gly Thr Gly Cys Ala Leu
35 40 45

Asp Ala Ala Thr Arg Gly Leu Asn Val Ala Leu Val Glu Lys Gly Asp
50 55 60

Phe Ala Ser Gly Thr Ser Ser Lys Ser Thr Lys Met Ile His Gly Gly
65 70 75 80

Val Arg Tyr Leu Glu Lys Ala Phe Trp Glu Phe Ser Lys Ala Gln Leu
85 90 95

Asp Leu Val Ile Glu Ala Leu Asn Glu Arg Lys His Leu Ile Asn Thr
100 105 110

Ala Pro His Leu Cys Thr Val Leu Pro Ile Leu Ile Pro Ile Tyr Ser
115 120 125

Thr Trp Gln Val Pro Tyr Ile Tyr Met Gly Cys Lys Phe Tyr Asp Phe
130 135 140

Phe Gly Gly Ser Gln Asn Leu Lys Lys Ser Tyr Leu Leu Ser Lys Ser
145 150 155 160

Ala Thr Val Glu Lys Ala Pro Met Leu Thr Thr Asp Asn Leu Lys Ala
165 170 175

Ser Leu Val Tyr His Asp Gly Ser Phe Asn Asp Ser Arg Leu Asn Ala
180 185 190

Thr Leu Ala Ile Thr Gly Val Glu Asn Gly Ala Thr Val Leu Ile Tyr
195 200 205

Val Glu Val Gln Lys Leu Ile Lys Asp Pro Thr Ser Gly Lys Val Ile
210 215 220

Gly Ala Glu Ala Arg Asp Val Glu Thr Asn Glu Leu Val Arg Ile Asn
225 230 235 240

Ala Lys Cys Val Val Asn Ala Thr Gly Pro Tyr Ser Asp Ala Ile Leu
245 250 255

Gln Met Asp Arg Asn Pro Ser Gly Leu Pro Asp Ser Pro Leu Asn Asp
260 265 270

Asn Ser Lys Ile Lys Ser Thr Phe Asn Gln Ile Ser Val Met Asp Pro
275 280 285

Lys Met Val Ile Pro Ser Ile Gly Val His Ile Val Leu Pro Ser Phe
290 295 300

Tyr Ser Pro Lys Asp Met Gly Leu Leu Asp Val Arg Thr Ser Asp Gly
305 310 315 320

Arg Val Met Phe Phe Leu Pro Trp Gln Gly Lys Val Leu Ala Gly Thr
325 330 335

Thr Asp Ile Pro Leu Lys Gln Val Pro Glu Asn Pro Met Pro Thr Glu
340 345 350

Ala Asp Ile Gln Asp Ile Leu Lys Glu Leu Gln His Tyr Ile Glu Phe
355 360 365

Pro Val Lys Arg Glu Asp Val Leu Ser Ala Trp Ala Gly Val Arg Pro
370 375 380

Leu Val Arg Asp Pro Arg Thr Ile Pro Ala Asp Gly Lys Lys Gly Ser
385 390 395 400

Ala Thr Gln Gly Val Val Arg Ser His Phe Leu Phe Thr Ser Asp Asn
405 410 415

Gly Leu Ile Thr Ile Ala Gly Gly Lys Trp Thr Thr Tyr Arg Gln Met
420 425 430

Ala Glu Glu Thr Val Asp Lys Val Val Glu Val Gly Gly Phe His Asn
435 440 445

Leu Lys Pro Cys His Thr Arg Asp Ile Lys Leu Ala Gly Ala Glu Glu
450 455 460

Trp Thr Gln Asn Tyr Val Ala Leu Leu Ala Gln Asn Tyr His Leu Ser
465 470 475 480

Ser Lys Met Ser Asn Tyr Leu Val Gln Asn Tyr Gly Thr Arg Ser Ser
485 490 495

Ile Ile Cys Glu Phe Phe Lys Glu Ser Met Glu Asn Lys Leu Pro Leu
500 505 510

Ser Leu Ala Asp Lys Glu Asn Asn Val Ile Tyr Ser Ser Glu Glu Asn
515 520 525

Asn Leu Val Asn Phe Asp Thr Phe Arg Tyr Pro Phe Thr Ile Gly Glu
530 535 540

Leu Lys Tyr Ser Met Gln Tyr Glu Tyr Cys Arg Thr Pro Leu Asp Phe
545 550 555 560

Leu Leu Arg Arg Thr Arg Phe Ala Phe Leu Asp Ala Lys Glu Ala Leu
565 570 575

Asn Ala Val His Ala Thr Val Lys Val Met Gly Asp Glu Phe Asn Trp
580 585 590

Ser Glu Lys Lys Arg Gln Trp Glu Leu Glu Lys Thr Val Asn Phe Ile
595 600 605

Gln Gly Arg Phe Gly Val
610

(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 339 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:10: 

Met Asn Gln Arg Asn Ala Ser Met Thr Val Ile Gly Ala Gly Ser Tyr
1 5 10 15

Gly Thr Ala Leu Ala Ile Thr Leu Ala Arg Asn Gly His Glu Val Val
20 25 30

Leu Trp Gly His Asp Pro Glu His Ile Ala Thr Leu Glu Arg Asp Arg
35 40 45

Cys Asn Ala Ala Phe Leu Pro Asp Val Pro Phe Pro Asp Thr Leu His
50 55 60

Leu Glu Ser Asp Leu Ala Thr Ala Leu Ala Ala Ser Arg Asn Ile Leu
65 70 75 80

Val Val Val Pro Ser His Val Phe Gly Glu Val Leu Arg Gln Ile Lys
85 90 95

Pro Leu Met Arg Pro Asp Ala Arg Leu Val Trp Ala Thr Lys Gly Leu
100 105 110

Glu Ala Glu Thr Gly Arg Leu Leu Gln Asp Val Ala Arg Glu Ala Leu
115 120 125

Gly Asp Gln Ile Pro Leu Ala Val Ile Ser Gly Pro Thr Phe Ala Lys
130 135 140

Glu Leu Ala Ala Gly Leu Pro Thr Ala Ile Ser Leu Ala Ser Thr Asp
145 150 155 160

Gln Thr Phe Ala Asp Asp Leu Gln Gln Leu Leu His Cys Gly Lys Ser
165 170 175

Phe Arg Val Tyr Ser Asn Pro Asp Phe Ile Gly Val Gln Leu Gly Gly
180 185 190

Ala Val Lys Asn Val Ile Ala Ile Gly Ala Gly Met Ser Asp Gly Ile
195 200 205

Gly Phe Gly Ala Asn Ala Arg Thr Ala Leu Ile Thr Arg Gly Leu Ala
210 215 220

Glu Met Ser Arg Leu Gly Ala Ala Leu Gly Ala Asp Pro Ala Thr Phe
225 230 235 240

Met Gly Met Ala Gly Leu Gly Asp Leu Val Leu Thr Cys Thr Asp Asn
245 250 255

Gln Ser Arg Asn Arg Arg Phe Gly Met Met Leu Gly Gln Gly Met Asp
260 265 270

Val Gln Ser Ala Gln Glu Lys Ile Gly Gln Val Val Glu Gly Tyr Arg
275 280 285

Asn Thr Lys Glu Val Arg Glu Leu Ala His Arg Phe Gly Val Glu Met
290 295 300

Pro Ile Thr Glu Glu Ile Tyr Gln Val Leu Tyr Cys Gly Lys Asn Ala
305 310 315 320

Arg Glu Ala Ala Leu Thr Leu Leu Gly Arg Ala Arg Lys Asp Glu Arg
325 330 335

Ser Ser His

(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 501 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 

Met Glu Thr Lys Asp Leu Ile Val Ile Gly Gly Gly Ile Asn Gly Ala
1 5 10 15

Gly Ile Ala Ala Asp Ala Ala Gly Arg Gly Leu Ser Val Leu Met Leu
20 25 30

Glu Ala Gln Asp Leu Ala Cys Ala Thr Ser Ser Ala Ser Ser Lys Leu
35 40 45

Ile His Gly Gly Leu Arg Tyr Leu Glu His Tyr Glu Phe Arg Leu Val
50 55 60

Ser Glu Ala Leu Ala Glu Arg Glu Val Leu Leu Lys Met Ala Pro His
65 70 75 80

Ile Ala Phe Pro Met Arg Phe Arg Leu Pro His Arg Pro His Leu Arg
85 90 95

Pro Ala Trp Met Ile Arg Ile Gly Leu Phe Met Tyr Asp His Leu Gly
100 105 110

Lys Arg Thr Ser Leu Pro Gly Ser Thr Gly Leu Arg Phe Gly Ala Asn
115 120 125

Ser Val Leu Lys Pro Glu Ile Lys Arg Gly Phe Glu Tyr Ser Asp Cys
130 135 140

Trp Val Asp Asp Ala Arg Leu Val Leu Ala Asn Ala Gln Met Val Val
145 150 155 160

Arg Lys Gly Gly Glu Val Leu Thr Arg Thr Arg Ala Thr Ser Ala Arg
165 170 175

Arg Glu Asn Gly Leu Trp Ile Val Glu Ala Glu Asp Ile Asp Thr Gly
180 185 190

Lys Lys Tyr Ser Trp Gln Ala Arg Gly Leu Val Asn Ala Thr Gly Pro
195 200 205

Trp Val Lys Gln Phe Phe Asp Asp Gly Met His Leu Pro Ser Pro Tyr
210 215 220

Gly Ile Arg Leu Ile Lys Gly Ser His Ile Val Val Pro Arg Val His
225 230 235 240

Thr Gln Lys Gln Ala Tyr Ile Leu Gln Asn Glu Asp Lys Arg Ile Val
245 250 255

Phe Val Ile Pro Trp Met Asp Glu Phe Ser Ile Ile Gly Thr Thr Asp
260 265 270

Val Glu Tyr Lys Gly Asp Pro Lys Ala Val Lys Ile Glu Glu Ser Glu
275 280 285

Ile Asn Tyr Leu Leu Asn Val Tyr Asn Thr His Phe Lys Lys Gln Leu
290 295 300

Ser Arg Asp Asp Ile Val Trp Thr Tyr Ser Gly Val Arg Pro Leu Cys
305 310 315 320

Asp Asp Glu Ser Asp Ser Pro Gln Ala Ile Thr Arg Asp Tyr Thr Leu
325 330 335

Asp Ile His Asp Glu Asn Gly Lys Ala Pro Leu Leu Ser Val Phe Gly
340 345 350

Gly Lys Leu Thr Thr Tyr Arg Lys Leu Ala Glu His Ala Leu Glu Lys
355 360 365

Leu Thr Pro Tyr Tyr Gln Gly Ile Gly Pro Ala Trp Thr Lys Glu Ser
370 375 380

Val Leu Pro Gly Gly Ala Ile Glu Gly Asp Arg Asp Asp Tyr Ala Ala
385 390 395 400

Arg Leu Arg Arg Arg Tyr Pro Phe Leu Thr Glu Ser Leu Ala Arg His
405 410 415

Tyr Ala Arg Thr Tyr Gly Ser Asn Ser Glu Leu Leu Leu Gly Asn Ala
420 425 430

Gly Thr Val Ser Asp Leu Gly Glu Asp Phe Gly His Glu Phe Tyr Glu
435 440 445

Ala Glu Leu Lys Tyr Leu Val Asp His Glu Trp Val Arg Arg Ala Asp
450 455 460

Asp Ala Leu Trp Arg Arg Thr Lys Gln Gly Met Trp Leu Asn Ala Asp
465 470 475 480

Gln Gln Ser Arg Val Ser Gln Trp Leu Val Glu Tyr Thr Gln Gln Arg
485 490 495

Leu Ser Leu Ala Ser
500

(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 542 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown 

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:

Met Lys Thr Arg Asp Ser Gln Ser Ser Asp Val Ile Ile Ile Gly Gly
1 5 10 15

Gly Ala Thr Gly Ala Gly Ile Ala Arg Asp Cys Ala Leu Arg Gly Leu
20 25 30

Arg Val Ile Leu Val Glu Arg His Asp Ile Ala Thr Gly Ala Thr Gly
35 40 45

Arg Asn His Gly Leu Leu His Ser Gly Ala Arg Tyr Ala Val Thr Asp
50 55 60

Ala Glu Ser Ala Arg Glu Cys Ile Ser Glu Asn Gln Ile Leu Lys Arg
65 70 75 80

Ile Ala Arg His Cys Val Glu Pro Thr Asn Gly Leu Phe Ile Thr Leu
85 90 95

Pro Glu Asp Asp Leu Ser Phe Gln Ala Thr Phe Ile Arg Ala Cys Glu
100 105 110

Glu Ala Gly Ile Ser Ala Glu Ala Ile Asp Pro Gln Gln Ala Arg Ile
115 120 125

Ile Glu Pro Ala Val Asn Pro Ala Leu Ile Gly Ala Val Lys Val Pro
130 135 140

Asp Gly Thr Val Asp Pro Phe Arg Leu Thr Ala Ala Asn Met Leu Asp
145 150 155 160

Ala Lys Glu His Gly Ala Val Ile Leu Thr Ala His Glu Val Thr Gly
165 170 175

Leu Ile Arg Glu Gly Ala Thr Val Cys Gly Val Arg Val Arg Asn His
180 185 190

Leu Thr Gly Glu Thr Gln Ala Leu His Ala Pro Val Val Val Asn Ala
195 200 205

Ala Gly Ile Trp Gly Gln His Ile Ala Glu Tyr Ala Asp Leu Arg Ile
210 215 220

Arg Met Phe Pro Ala Lys Gly Ser Leu Leu Ile Met Asp His Arg Ile
225 230 235 240

Asn Gln His Val Ile Asn Arg Cys Arg Lys Pro Ser Asp Ala Asp Ile
245 250 255

Leu Val Pro Gly Asp Thr Ile Ser Leu Ile Gly Thr Thr Ser Leu Arg
260 265 270

Ile Asp Tyr Asn Glu Ile Asp Asp Asn Arg Val Thr Ala Glu Glu Val
275 280 285

Asp Ile Leu Leu Arg Glu Gly Glu Lys Leu Ala Pro Val Met Ala Lys
290 295 300

Thr Arg Ile Leu Arg Ala Tyr Ser Gly Val Arg Pro Leu Val Ala Ser
305 310 315 320

Asp Asp Asp Pro Ser Gly Arg Asn Leu Ser Arg Gly Ile Val Leu Leu
325 330 335

Asp His Ala Glu Arg Asp Gly Leu Asp Gly Phe Ile Thr Ile Thr Gly
340 345 350

Gly Lys Leu Met Thr Tyr Arg Leu Met Ala Glu Trp Ala Thr Asp Ala
355 360 365

Val Cys Arg Lys Leu Gly Asn Thr Arg Pro Cys Thr Thr Ala Asp Leu
370 375 380

Ala Leu Pro Gly Ser Gln Glu Pro Ala Glu Val Thr Leu Arg Lys Val
385 390 395 400

Ile Ser Leu Pro Ala Pro Leu Arg Gly Ser Ala Val Tyr Arg His Gly
405 410 415

Asp Arg Thr Pro Ala Trp Leu Ser Glu Gly Arg Leu His Arg Ser Leu
420 425 430

Val Cys Glu Cys Glu Ala Val Thr Ala Gly Glu Val Gln Tyr Ala Val
435 440 445

Glu Asn Leu Asn Val Asn Ser Leu Leu Asp Leu Arg Arg Arg Thr Arg
450 455 460

Val Gly Met Gly Thr Cys Gln Gly Glu Leu Cys Ala Cys Arg Ala Ala
465 470 475 480

Gly Leu Leu Gln Arg Phe Asn Val Thr Thr Ser Ala Gln Ser Ile Glu
485 490 495

Gln Leu Ser Thr Phe Leu Asn Glu Arg Trp Lys Gly Val Gln Pro Ile
500 505 510

Ala Trp Gly Asp Ala Leu Arg Glu Ser Glu Phe Thr Arg Trp Val Tyr
515 520 525

Gln Gly Leu Cys Gly Leu Glu Lys Glu Gln Lys Asp Ala Leu
530 535 540

(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 250 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown 

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 

Met Gly Leu Thr Thr Lys Pro Leu Ser Leu Lys Val Asn Ala Ala Leu
1 5 10 15

Phe Asp Val Asp Gly Thr Ile Ile Ile Ser Gln Pro Ala Ile Ala Ala
20 25 30

Phe Trp Arg Asp Phe Gly Lys Asp Lys Pro Tyr Phe Asp Ala Glu His
35 40 45

Val Ile Gln Val Ser His Gly Trp Arg Thr Phe Asp Ala Ile Ala Lys
50 55 60

Phe Ala Pro Asp Phe Ala Asn Glu Glu Tyr Val Asn Lys Leu Glu Ala
65 70 75 80

Glu Ile Pro Val Lys Tyr Gly Glu Lys Ser Ile Glu Val Pro Gly Ala
85 90 95

Val Lys Leu Cys Asn Ala Leu Asn Ala Leu Pro Lys Glu Lys Trp Ala
100 105 110

Val Ala Thr Ser Gly Thr Arg Asp Met Ala Gln Lys Trp Phe Glu His
115 120 125

Leu Gly Ile Arg Arg Pro Lys Tyr Phe Ile Thr Ala Asn Asp Val Lys
130 135 140

Gln Gly Lys Pro His Pro Glu Pro Tyr Leu Lys Gly Arg Asn Gly Leu
145 150 155 160

Gly Tyr Pro Ile Asn Glu Gln Asp Pro Ser Lys Ser Lys Val Val Val
165 170 175

Phe Glu Asp Ala Pro Ala Gly Ile Ala Ala Gly Lys Ala Ala Gly Cys
180 185 190

Lys Ile Ile Gly Ile Ala Thr Thr Phe Asp Leu Asp Phe Leu Lys Glu
195 200 205

Lys Gly Cys Asp Ile Ile Val Lys Asn His Glu Ser Ile Arg Val Gly
210 215 220

Gly Tyr Asn Ala Glu Thr Asp Glu Val Glu Phe Ile Phe Asp Asp Tyr
225 230 235 240

Leu Tyr Ala Lys Asp Asp Leu Leu Lys Trp
245 250
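
As an illustrative cross-check that is not part of the original listing, the sketch below translates the first 30 bases of SEQ ID NO: 5 with the standard genetic code and prints the one-letter result, which can be compared with the first ten residues of SEQ ID NO: 13. The names `CODON_TABLE` and `translate` are hypothetical helpers introduced here; use of the standard nuclear code is an assumption.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (assumed applicable here).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate a DNA string (spaces allowed) codon by codon; '*' marks a stop."""
    dna = "".join(dna.split()).upper()
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# First three 10-base blocks of SEQ ID NO: 5 as printed in the listing.
seq5_start = "ATGGGATTGA CTACTAAACC TCTATCTTTG"
print(translate(seq5_start))
```

The printed string corresponds to Met Gly Leu Thr Thr Lys Pro Leu Ser Leu, the residues shown at positions 1 to 10 of SEQ ID NO: 13.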

(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 271 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:

Met Lys Arg Phe Asn Val Leu Lys Tyr Ile Arg Thr Thr Lys Ala Asn
1 5 10 15

Ile Gln Thr Ile Ala Met Pro Leu Thr Thr Lys Pro Leu Ser Leu Lys
20 25 30

Ile Asn Ala Ala Leu Phe Asp Val Asp Gly Thr Ile Ile Ile Ser Gln
35 40 45

Pro Ala Ile Ala Ala Phe Trp Arg Asp Phe Gly Lys Asp Lys Pro Tyr
50 55 60

Phe Asp Ala Glu His Val Ile His Ile Ser His Gly Trp Arg Thr Tyr
65 70 75 80

Asp Ala Ile Ala Lys Phe Ala Pro Asp Phe Ala Asp Glu Glu Tyr Val
85 90 95

Asn Lys Leu Glu Gly Glu Ile Pro Glu Lys Tyr Gly Glu His Ser Ile
100 105 110

Glu Val Pro Gly Ala Val Lys Leu Cys Asn Ala Leu Asn Ala Leu Pro
115 120 125

Lys Glu Lys Trp Ala Val Ala Thr Ser Gly Thr Arg Asp Met Ala Lys
130 135 140

Lys Trp Phe Asp Ile Leu Lys Ile Lys Arg Pro Glu Tyr Phe Ile Thr
145 150 155 160

Ala Asn Asp Val Lys Gln Gly Lys Pro His Pro Glu Pro Tyr Leu Lys
165 170 175

Gly Arg Asn Gly Leu Gly Phe Pro Ile Asn Glu Gln Asp Pro Ser Lys
180 185 190

Ser Lys Val Val Val Phe Glu Asp Ala Pro Ala Gly Ile Ala Ala Gly
195 200 205

Lys Ala Ala Gly Cys Lys Ile Val Gly Ile Ala Thr Thr Phe Asp Leu
210 215 220

Asp Phe Leu Lys Glu Lys Gly Cys Asp Ile Ile Val Lys Asn His Glu
225 230 235 240

Ser Ile Arg Val Gly Glu Tyr Asn Ala Glu Thr Asp Glu Val Glu Leu
245 250 255

Ile Phe Asp Asp Tyr Leu Tyr Ala Lys Asp Asp Leu Leu Lys Trp
260 265 270

(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 709 amino acids 
(B) TYPE: amino acid
(C) STRANDEDNESS: unknown
(D) TOPOLOGY: unknown 



(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 

Met Phe Pro Ser Leu Phe Arg Leu Val Val Phe Ser Lys Arg Tyr Ile
1 5 10 15

Phe Arg Ser Ser Gln Arg Leu Tyr Thr Ser Leu Lys Gln Glu Gln Ser
20 25 30

Arg Met Ser Lys Ile Met Glu Asp Leu Arg Ser Asp Tyr Val Pro Leu
35 40 45

Ile Ala Ser Ile Asp Val Gly Thr Thr Ser Ser Arg Cys Ile Leu Phe
50 55 60

Asn Arg Trp Gly Gln Asp Val Ser Lys His Gln Ile Glu Tyr Ser Thr
65 70 75 80

Ser Ala Ser Lys Gly Lys Ile Gly Val Ser Gly Leu Arg Arg Pro Ser
85 90 95

Thr Ala Pro Ala Arg Glu Thr Pro Asn Ala Gly Asp Ile Lys Thr Ser
100 105 110

Gly Lys Pro Ile Phe Ser Ala Glu Gly Tyr Ala Ile Gln Glu Thr Lys
115 120 125

Phe Leu Lys Ile Glu Glu Leu Asp Leu Asp Phe His Asn Glu Pro Thr
130 135 140

Leu Lys Phe Pro Lys Pro Gly Trp Val Glu Cys His Pro Gln Lys Leu
145 150 155 160

Leu Val Asn Val Val Gln Cys Leu Ala Ser Ser Leu Leu Ser Leu Gln
165 170 175

Thr Ile Asn Ser Glu Arg Val Ala Asn Gly Leu Pro Pro Tyr Lys Val
180 185 190

Ile Cys Met Gly Ile Ala Asn Met Arg Glu Thr Thr Ile Leu Trp Ser
195 200 205

Arg Arg Thr Gly Lys Pro Ile Val Asn Tyr Gly Ile Val Trp Asn Asp
210 215 220

Thr Arg Thr Ile Lys Ile Val Arg Asp Lys Trp Gln Asn Thr Ser Val
225 230 235 240

Asp Arg Gln Leu Gln Leu Arg Gln Lys Thr Gly Leu Pro Leu Leu Ser
245 250 255

Thr Tyr Phe Ser Cys Ser Lys Leu Arg Trp Phe Leu Asp Asn Glu Pro
260 265 270

Leu Cys Thr Lys Ala Tyr Glu Glu Asn Asp Leu Met Phe Gly Thr Val
275 280 285

Asp Thr Trp Leu Ile Tyr Gln Leu Thr Lys Gln Lys Ala Phe Val Ser
290 295 300

Asp Val Thr Asn Ala Ser Arg Thr Gly Phe Met Asn Leu Ser Thr Leu
305 310 315 320

Lys Tyr Asp Asn Glu Leu Leu Glu Phe Trp Gly Ile Asp Lys Asn Leu
325 330 335

Ile His Met Pro Glu Ile Val Ser Ser Ser Gln Tyr Tyr Gly Asp Phe
340 345 350

Gly Ile Pro Asp Trp Ile Met Glu Lys Leu His Asp Ser Pro Lys Thr
355 360 365

Val Leu Arg Asp Leu Val Lys Arg Asn Leu Pro Ile Gln Gly Cys Leu
370 375 380

Gly Asp Gln Ser Ala Ser Met Val Gly Gln Leu Ala Tyr Lys Pro Gly
385 390 395 400

Ala Ala Lys Cys Thr Tyr Gly Thr Gly Cys Phe Leu Leu Tyr Asn Thr
405 410 415

Gly Thr Lys Lys Leu Ile Ser Gln His Gly Ala Leu Thr Thr Leu Ala
420 425 430

Phe Trp Phe Pro His Leu Gln Glu Tyr Gly Gly Gln Lys Pro Glu Leu
435 440 445

Ser Lys Pro His Phe Ala Leu Glu Gly Ser Val Ala Val Ala Gly Ala
450 455 460

Val Val Gln Trp Leu Arg Asp Asn Leu Arg Leu Ile Asp Lys Ser Glu
465 470 475 480

Asp Val Gly Pro Ile Ala Ser Thr Val Pro Asp Ser Gly Gly Val Val
485 490 495

Phe Val Pro Ala Phe Ser Gly Leu Phe Ala Pro Tyr Trp Asp Pro Asp
500 505 510

Ala Arg Ala Thr Ile Met Gly Met Ser Gln Phe Thr Thr Ala Ser His
515 520 525

Ile Ala Arg Ala Ala Val Glu Gly Val Cys Phe Gln Ala Arg Ala Ile
530 535 540

Leu Lys Ala Met Ser Ser Asp Ala Phe Gly Glu Gly Ser Lys Asp Arg
545 550 555 560

Asp Phe Leu Glu Glu Ile Ser Asp Val Thr Tyr Glu Lys Ser Pro Leu
565 570 575

Ser Val Leu Ala Val Asp Gly Gly Met Ser Arg Ser Asn Glu Val Met
580 585 590

Gln Ile Gln Ala Asp Ile Leu Gly Pro Cys Val Lys Val Arg Arg Ser
595 600 605

Pro Thr Ala Glu Cys Thr Ala Leu Gly Ala Ala Ile Ala Ala Asn Met
610 615 620

Ala Phe Lys Asp Val Asn Glu Arg Pro Leu Trp Lys Asp Leu His Asp
625 630 635 640

Val Lys Lys Trp Val Phe Tyr Asn Gly Met Glu Lys Asn Glu Gln Ile
645 650 655

Ser Pro Glu Ala His Pro Asn Leu Lys Ile Phe Arg Ser Glu Ser Asp
660 665 670

Asp Ala Glu Arg Arg Lys His Trp Lys Tyr Trp Glu Val Ala Val Glu
675 680 685

Arg Ser Lys Gly Trp Leu Lys Asp Ile Glu Gly Glu His Glu Gln Val
690 695 700

Leu Glu Asn Phe Gln
705

(2) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 51 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:
GCGCGGATCC AGGAGTCTAG AATTATGGGA TTGACTACTA AACCTCTATC T 51 
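
As a hedged illustration that is not part of the original listing, the following sketch shows one way to strip the block spacing and trailing position counter from a listing row, compare the base count with the declared LENGTH, and check that the 3' portion of SEQ ID NO: 16 (from its first ATG onward) matches the start of SEQ ID NO: 5. The helper name `strip_positions` is hypothetical.

```python
import re


def strip_positions(row: str) -> str:
    """Keep only A, C, G and T from a listing row (drops spaces and position counters)."""
    return re.sub(r"[^ACGT]", "", row.upper())


# SEQ ID NO: 16 exactly as printed, including the trailing counter "51".
seq16 = strip_positions("GCGCGGATCC AGGAGTCTAG AATTATGGGA TTGACTACTA AACCTCTATC T 51")

# First three blocks of SEQ ID NO: 5 as printed.
seq5_start = strip_positions("ATGGGATTGA CTACTAAACC TCTATCTTTG")

atg_index = seq16.find("ATG")   # first ATG in the oligonucleotide
overlap = seq16[atg_index:]     # portion from that ATG to the 3' end

print(len(seq16))                        # compare with the declared LENGTH
print(seq5_start.startswith(overlap))    # does the 3' portion match SEQ ID NO: 5?
```

The same `strip_positions` call can be applied row by row to a longer entry and the pieces concatenated before comparing against the LENGTH field.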
(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 36 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: 
GATACGCCCG GGTTACCATT TCAACAGATC GTCCTT 36 
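
A similar sketch, again only illustrative and not taken from the source, checks the opposite orientation: the reverse complement of SEQ ID NO: 17 is compared with the last row of SEQ ID NO: 5. The helper `reverse_complement` and the choice to compare 24 bases are assumptions made for this example.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


# SEQ ID NO: 17 as printed, with block spacing removed.
seq17 = "GATACGCCCGGGTTACCATTTCAACAGATCGTCCTT"

# Last row of SEQ ID NO: 5 (positions 721 to 753), spacing removed.
seq5_end = "TTATATGCTAAGGACGATCTGTTGAAATGGTAA"

rc17 = reverse_complement(seq17)
annealing_part = rc17[:24]  # 24 bases chosen for illustration

print(rc17)
print(seq5_end.endswith(annealing_part))
```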
(2) INFORMATION FOR SEQ ID NO: 18: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 34 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:18: 

TTGATAATAT AACCATGGCT GCTGCTGCTG ATAG 34 
(2) INFORMATION FOR SEQ ID NO: 19:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 39 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19:
GTATGATATG TTATCTTGGA TCCAATAAAT CTAATCTTC 39
(2) INFORMATION FOR SEQ ID NO: 20:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:20: 
CATGACTAGT AAGGAGGACA ATTC 24 
(2) INFORMATION FOR SEQ ID NO: 21: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21: 
CATGGAATTG TCCTCCTTAC TAGT 24 
(2) INFORMATION FOR SEQ ID NO: 22: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:22: 
CTAGTAAGGA GGACAATTC 19 
(2) INFORMATION FOR SEQ ID NO:23: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23: 
CATGGAATTG TCCTCCTTA 19 
(2) INFORMATION FOR SEQ ID NO:24: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 
(A) DESCRIPTION: /desc = "PRIMER" 

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:24: 
GATCCAGGAA ACAGA 15 

(2) INFORMATION FOR SEQ ID NO: 25: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 
(A) DESCRIPTION: /desc = "PRIMER" 

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25: 
CTAGTCTGTT TCCTG 15 
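
The declared LENGTH fields for the oligonucleotides of SEQ ID NO:16 to SEQ ID NO:25 can be checked against the sequences themselves. The following is a minimal sketch in Python, not part of the original listing; the names `RAW`, `OLIGOS`, `length_mismatches`, and `cores_pair` are hypothetical, and the 4-base overhang used in the pairing check is an assumption made only for illustration. The sketch also tests whether SEQ ID NO:20/21, 22/23, and 24/25 are reverse complements of one another once the first four bases of each strand are set aside.

```python
# Sequences and declared lengths transcribed from SEQ ID NO:16-25 above.
# Spaces are kept as in the listing and removed before any check.
RAW = {
    16: ("GCGCGGATCC AGGAGTCTAG AATTATGGGA TTGACTACTA AACCTCTATC T", 51),
    17: ("GATACGCCCG GGTTACCATT TCAACAGATC GTCCTT", 36),
    18: ("TTGATAATAT AACCATGGCT GCTGCTGCTG ATAG", 34),
    19: ("GTATGATATG TTATCTTGGA TCCAATAAAT CTAATCTTC", 39),
    20: ("CATGACTAGT AAGGAGGACA ATTC", 24),
    21: ("CATGGAATTG TCCTCCTTAC TAGT", 24),
    22: ("CTAGTAAGGA GGACAATTC", 19),
    23: ("CATGGAATTG TCCTCCTTA", 19),
    24: ("GATCCAGGAA ACAGA", 15),
    25: ("CTAGTCTGTT TCCTG", 15),
}

# Normalised form: SEQ ID number -> (sequence without spaces, declared length).
OLIGOS = {n: (seq.replace(" ", ""), length) for n, (seq, length) in RAW.items()}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def length_mismatches(oligos: dict) -> list:
    """List the SEQ ID numbers whose sequence length differs from the declared one."""
    return [n for n, (seq, declared) in oligos.items() if len(seq) != declared]


def cores_pair(a: str, b: str, overhang: int = 4) -> bool:
    """True if a and b are reverse complements after dropping `overhang`
    bases from the 5' end of each (overhang size is an assumption)."""
    return a[overhang:] == reverse_complement(b[overhang:])


if __name__ == "__main__":
    print("length mismatches:", length_mismatches(OLIGOS))
    for x, y in ((20, 21), (22, 23), (24, 25)):
        print(x, y, cores_pair(OLIGOS[x][0], OLIGOS[y][0]))
```

A check of this kind is useful mainly for catching transcription errors in a listing; it says nothing about how the oligonucleotides were used experimentally.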
INDICATIONS RELATING TO A DEPOSITED MICROORGANISM 
(PCT Rule 13bis) 

A. The indications made below relate to the microorganism referred to in the description 
on page 3, lines 21 and 22 

B. IDENTIFICATION OF DEPOSIT 
Further deposits are identified on an additional sheet [ ] 

Name of depositary institution 
AMERICAN TYPE CULTURE COLLECTION 

Address of depositary institution (including postal code and country) 
12301 Parklawn Drive 
Rockville, Maryland 20852 
US 

Date of deposit 
26 September 1996 

Accession Number 
98187 

C. ADDITIONAL INDICATIONS (leave blank if not applicable) This information is continued on an additional sheet [ ] 

In respect of those designations in which a European patent is sought, 
a sample of the deposited microorganism will be made available until 
the publication of the mention of the grant of the European patent or 
until the date on which the application has been refused or withdrawn 
or is deemed to be withdrawn, only by the issue of such a sample to an 
expert nominated by the person requesting the sample. (Rule 28(4) EPC) 

D. DESIGNATED STATES FOR WHICH INDICATIONS ARE MADE (if the indications are not for all designated States) 

E. SEPARATE FURNISHING OF INDICATIONS (leave blank if not applicable) 

The indications listed below will be submitted to the International Bureau later (specify the general nature of the indications, e.g., "Accession 
Number of Deposit") 

For receiving Office use only 

[ ] This sheet was received with the international application 

Authorized officer 

For International Bureau use only 

[ ] This sheet was received by the International Bureau on: 

Authorized officer 

Form PCT/RO/134 (July 1992) 
INDICATIONS RELATING TO A DEPOSITED MICROORGANISM 
(PCT Rule 13bis) 

A. The indications made below relate to the microorganism referred to in the description 
on page 3, lines 21 and 22 

B. IDENTIFICATION OF DEPOSIT 
Further deposits are identified on an additional sheet [ ] 

Name of depositary institution 
AMERICAN TYPE CULTURE COLLECTION 

Address of depositary institution (including postal code and country) 
12301 Parklawn Drive 
Rockville, Maryland 20852 
US 

Date of deposit 
06 November 1996 

Accession Number 
98248 

C. ADDITIONAL INDICATIONS (leave blank if not applicable) This information is continued on an additional sheet [ ] 

In respect of those designations in which a European patent is sought, 
a sample of the deposited microorganism will be made available until 
the publication of the mention of the grant of the European patent or 
until the date on which the application has been refused or withdrawn 
or is deemed to be withdrawn, only by the issue of such a sample to an 
expert nominated by the person requesting the sample. (Rule 28(4) EPC) 

D. DESIGNATED STATES FOR WHICH INDICATIONS ARE MADE (if the indications are not for all designated States) 

E. SEPARATE FURNISHING OF INDICATIONS (leave blank if not applicable) 

The indications listed below will be submitted to the International Bureau later (specify the general nature of the indications, e.g., "Accession 
Number of Deposit") 

[ ] This sheet was received with the international application 

Authorized officer 

[ ] This sheet was received by the International Bureau on: 

Authorized officer 

Form PCT/RO/134 (July 1992) 
WHAT IS CLAIMED IS: 

1. A method for the production of glycerol from a recombinant 
organism comprising: 

(i) transforming a suitable host cell with an expression cassette 
comprising either one or both of 

(a) a gene encoding a glycerol-3-phosphate dehydrogenase enzyme; 

(b) a gene encoding a glycerol-3-phosphate phosphatase enzyme; 

(ii) culturing the transformed host cell of (i) in the presence of at 
least one carbon source selected from the group consisting of monosaccharides, 
oligosaccharides, polysaccharides, and single-carbon substrates, whereby glycerol 
is produced; and 

(iii) recovering the glycerol produced in (ii). 

2. A method according to Claim 1 wherein the expression cassette 
comprises a gene encoding a glycerol-3-phosphate dehydrogenase activity. 

3. A method according to Claim 1 wherein said expression cassette 
comprises a gene encoding a glycerol-3-phosphate phosphatase activity. 

4. A method according to Claim 1 wherein said expression cassette 
comprises genes encoding a glycerol-3-phosphate phosphatase activity and a 
glycerol-3-phosphate dehydrogenase activity. 

5. A method according to Claim 1 wherein the suitable host cell is 
selected from the group consisting of bacteria, yeast, and filamentous fungi. 

6. A method according to Claim 5 wherein the suitable host cell is 
selected from the group consisting of Citrobacter, Enterobacter, Clostridium, 
Klebsiella, Aerobacter, Lactobacillus, Aspergillus, Saccharomyces, 
Schizosaccharomyces, Zygosaccharomyces, Pichia, Kluyveromyces, Candida, 
Hansenula, Debaryomyces, Mucor, Torulopsis, Methylobacter, Escherichia, 
Salmonella, Bacillus, Streptomyces, and Pseudomonas. 

7. A method according to Claim 6 wherein the suitable host cell is 
E. coli or Saccharomyces. 

8. A method according to Claim 1 wherein the carbon source is 
glucose. 

9. A method according to Claim 1 wherein the gene encoding a 
glycerol-3-phosphate dehydrogenase enzyme corresponds to the amino acid 
sequence given in SEQ ID NO:7, SEQ ID NO:8, SEQ ID NO:9, SEQ ID NO:10, 
SEQ ID NO:11, or SEQ ID NO:12 and wherein the amino acid sequence 
encompasses amino acid substitutions, deletions or insertions that do not alter the 
functional properties of the enzyme. 

10. A method according to Claim 1 wherein the gene encoding a 
glycerol-3-phosphatase enzyme corresponds to the amino acid sequence given in 
SEQ ID NO:13 or SEQ ID NO:14 and wherein the amino acid sequence may 
encompass amino acid substitutions, deletions or additions that do not alter the 
function of the enzyme. 

11. A method according to Claim 1 wherein the gene encoding a 
glycerol kinase enzyme corresponds to the amino acid sequence given in 
SEQ ID NO:15 and wherein said amino acid sequence may encompass amino acid 
substitutions, deletions or additions that do not alter the function of said enzyme. 

12. A transformed host cell comprising a gene encoding a glycerol-3- 
phosphate dehydrogenase activity. 

13. A transformed host cell comprising a gene encoding a glycerol-3- 
phosphate phosphatase activity. 

14. A method for selecting for glycerol-3-phosphate dehydrogenase gene 
expression by complementation comprising supplying glycerol or glycerol-3- 
phosphate to a strain auxotrophic for glycerol or glycerol-3-phosphate by virtue of 
a mutation in the glycerol-3-phosphate dehydrogenase gene of the strain. 

15. A method for selecting for glycerol-3-phosphate dehydrogenase gene 
expression by complementation comprising supplying salt to a strain 
osmosensitive by virtue of a mutation in the gene for glycerol-3-phosphate 
dehydrogenase of the strain. 

16. An Escherichia coli pAH21/DH5α containing the GPP2 gene and 
identified by the designation ATCC 98187. 

17. An Escherichia coli pDAR1A/AA200 containing the DAR1 gene 
and identified by the designation ATCC 98248. 